Fixing hyphenated text?


#1

When creating note cards using the Text Excerpt function, often the original quotation contains words broken by hyphens at the end of a line. MarginNote reproduces this, which means in effect that a single word gets broken in two. This means I can’t easily search the mindmap for keywords.

For example, if the original text has the word “pre- determined”, this also appears in the note and I can’t find it by searching for “predetermined”. Of course I can go through and delete all the hyphens by hand, but this is sort of tedious, especially when the spelling checker already “knows” these words are broken.

Is there some setting that can tell MarginNote to automatically fix up these broken words?


#2

I investigated this a little bit. OS X has a text replacement feature in “System Preferences > Keyboard > Text”, but it refuses to accept "- ". Idk why, but that’s how it is. There is also a “smart dashes” feature but that is for changing double dashes to em-dashes. Upshot: there is seemingly no built-in OS X text service for cleaning up this very common issue.

I hacked up an OS X Automator workflow that runs sed 's/- //g' as a shell script, but this still means selecting each quotation that I create with MarginNote and then running the script on it.

Since MarginNote is creating the note card by copying the selected text from the PDF, and since the selected text will be on separate lines, IMHO the best way to attack this would be with a new option inside MarginNote that checks for "- " at the end of a line, and removes it. This feature could be enabled/disabled via a Preference setting, and would (for me, at least) greatly improve the workflow for creating searchable notecards.
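
To make the idea concrete, here is a minimal sketch in Python of what such an option might do. It is only an illustration: how MarginNote actually builds the excerpt is not known here, and the function name join_excerpt_lines and the list-of-lines input are assumptions.

```python
def join_excerpt_lines(lines):
    """Join extracted lines into one string, rejoining words that were
    split by a hyphen at the end of a line.

    Hypothetical helper for illustration; not MarginNote's code.
    """
    # Ignore blank lines and surrounding whitespace from the extraction.
    cleaned = [line.strip() for line in lines if line.strip()]
    pieces = []
    for index, line in enumerate(cleaned):
        has_next = index < len(cleaned) - 1
        # Treat a trailing hyphen as a word break only if a letter precedes it
        # and another line follows.
        broken_word = (
            has_next
            and len(line) > 1
            and line.endswith("-")
            and line[-2].isalpha()
        )
        if broken_word:
            pieces.append(line[:-1])   # drop the hyphen, add no space
        elif has_next:
            pieces.append(line + " ")  # normal line break becomes a space
        else:
            pieces.append(line)
    return "".join(pieces)
```

With the lines "The outcome was pre-" and "determined by the rules." this gives "The outcome was predetermined by the rules." One known weakness: a genuine compound that happens to break at its hyphen (say "well-" / "known") would come out as "wellknown", which is one more reason to keep the feature behind a preference.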


#3

I don’t have the immediate information at hand. I would think however that a text service can be created to run the sed script on the selection.

Also, maybe the "- " sequence needs to be escaped to be recognized?


JJW


#4

I googled for information about escaping characters in “System Preferences > Keyboard > Text” text replacements, but I can’t find anything. RegEx doesn’t work. If the “With” field is empty (i.e., replace something with nothing), the rule just gets ditched.

Also, I’ve noticed that the “Replace” field cannot include a space. OS X shows a warning about this.

I even tried editing a plist file directly and then dragging it into the Text Replacements field, but it just gets ignored.

It would be nice to do this with OS X, but it seems their service design is just too limited.

EDIT: Upon further experimentation, the replacements in “System Preferences > Keyboard > Text” only seem to work when entering text with the keyboard, not when copying and pasting text. So, again, unless there is some deeper text service in OS X that we can fiddle with, enhancing MarginNote seems like the best option to me.


#5

Would you know RegEx well enough to capture the one character before the hyphen plus the hyphen itself? You could then replace that match with just the character before the hyphen.
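
Something along these lines, sketched in Python rather than in the System Preferences field (which may not support regular expressions at all). The pattern and the name dehyphenate are my own assumptions; beyond the capture-and-replace idea, the sketch also swallows the space after the hyphen and requires a lowercase letter to follow.

```python
import re

# Group 1 captures the letter before the hyphen. The hyphen and the
# whitespace after it are matched and discarded. The lookahead requires
# a lowercase letter next, without consuming it.
BROKEN_WORD = re.compile(r"([A-Za-z])-\s+(?=[a-z])")


def dehyphenate(text):
    """Replace 'letter, hyphen, whitespace' with just the letter."""
    return BROKEN_WORD.sub(r"\1", text)
```

So dehyphenate("pre- determined") returns "predetermined", while "well-known" (no space after the hyphen) and "a - b" (no letter before it) are left alone. It is still fooled by phrases such as "pre- and post-war", and it only knows ASCII letters, so treat it as a starting point.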


JJW


#6

Do you mean a regular expression in the “System Preferences > Keyboard > Text” text replacements?

I was unable to get that to work. Also, the text replacement only seems to be applied when typing from the keyboard.

AFAICT, the intervention needs to happen when MarginNote copies the lines of text selected in the PDF file, and then concatenates them to create the text for a notecard. I.e., the code inside MarginNote needs to be modified to strip the hyphen+space at the end of each line.