Glossary of Useful Terms

This document uses a number of terms that either may not be familiar to the “casual” user or are used in a technical way unique to word processing. We offer a small glossary of terms here. If you encounter other words that you believe should be included, please send them to <profeedback@nisus.com>.

64-bit computing
In computer architecture, 64-bit computing is the use of processors that have datapath widths, integer sizes, and memory address widths of 64 bits (eight octets). Also, 64-bit CPU and ALU architectures are those that are based on registers, address buses, or data buses of that size. From the software perspective, 64-bit computing means the use of code with 64-bit virtual memory addresses. You can learn more at the Wikipedia article.

ASCII “American Standard Code for Information Interchange”
ASCII was used for many years to represent all the alphanumeric, punctuation, and similar characters you stored on your computer. ASCII could only represent the standard Roman character set, so over the years various “kludges” developed, among them Apple’s WorldScript technology, which used the upper range (128–255) to display non-Roman characters.
Nisus Writer Classic worked seamlessly with this system. However, problems remained when users tried to exchange documents with more than one script system in them.
In 1986, only two years after the Macintosh was released, engineers at Xerox started working to create a single font that would display the identical characters shared by Japanese and Chinese. This led to early discussions of “Han Unification”. Simultaneously, based on issues related to Apple File Exchange, engineers at Apple began looking into the possibility of a “universal character set”. These efforts, and others, led to the development of the Unicode Consortium. According to the Unicode Consortium, Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
The first 128 codes for Unicode (0 through 127) are the same as ASCII.
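The overlap between ASCII and Unicode can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are made up for the example and have nothing to do with Nisus Writer Pro itself.

```python
# Build every 7-bit ASCII value (0 through 127) and decode it as ASCII text.
ascii_bytes = bytes(range(128))
text = ascii_bytes.decode("ascii")

# Each decoded character has a Unicode code point equal to its ASCII value.
same_numbers = all(ord(ch) == value for value, ch in enumerate(text))
print(same_numbers)       # True
print(ord("A"), chr(65))  # 65 A

# Values above 127 are not ASCII, so decoding them as ASCII fails.
try:
    bytes([200]).decode("ascii")
    decoded_high_byte = True
except UnicodeDecodeError:
    decoded_high_byte = False
print(decoded_high_byte)  # False
```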

baseline In some writing systems (such as Devanagari and formal Hebrew), the letters seem to hang from an imaginary line. In Roman-based scripts, however, the letters rest on, or descend below, an imaginary line called a baseline.

Boolean Named for George Boole, who developed a general method of symbolic reasoning that led to the idea that “on/off” (“true/false”, “yes/no”, “1 or 0”) circuits with relays could solve certain algebraic problems. This is the concept that supports the possibility of digital computers.

combining characters
Also known as enclosed alphanumerics, these were originally intended for use as bullets for lists. You can learn more at the Wikipedia article.

diacritical mark Also simply called diacritics, these are glyphs added to a letter. Different diacritics are specific to both Latin (Roman) and non-Latin writing systems. For more information see the Wikipedia article.

dingbat Not to be confused with Edith Bunker, a dingbat is an ornament used in typesetting, sometimes called a "printer's ornament". The term is often used to describe fonts with symbols and shapes in positions ordinarily held by alphabetical or numeric characters. The Unicode Dingbats block is U+2700–U+27BF.

file path According to the Wikipedia, “A path is the general form of a file or directory name, giving a file’s name and its unique location in a file system. Paths point to their location using a string of characters signifying directories, each path component separated by a delimiting character, most commonly the slash “/” or backslash character “\”, or colon “:” though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems, and are essential in the construction of Uniform Resource Locators (URLs).”
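As a small illustration of path components and delimiters, the sketch below uses Python's standard pathlib module. The two example paths are hypothetical and chosen only to show a slash-delimited and a backslash-delimited form.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical paths: one slash-delimited, one backslash-delimited.
posix_path = PurePosixPath("/Users/example/Documents/report.rtf")
windows_path = PureWindowsPath(r"C:\Users\example\Documents\report.rtf")

# Each path splits into components at its delimiter.
print(posix_path.parts)    # ('/', 'Users', 'example', 'Documents', 'report.rtf')
print(posix_path.parent)   # /Users/example/Documents
print(posix_path.name)     # report.rtf
print(windows_path.name)   # report.rtf
print(windows_path.drive)  # C:
```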

Flesch-Kincaid Grade Level and Flesch Reading Ease
Rudolf Flesch was trained as a lawyer in his native Vienna. He came to the United States in 1938 where he received his Ph.D. at Columbia University. Among his numerous books are The Art of Plain Talk, Say What You Mean, The Art of Clear Thinking, and The Art of Readable Writing (published by Harper and Row). This explanation of how to determine the reading ease of a text was culled from The Art of Readable Writing. There are two aspects to readability: ease of reading and interest. The first is determined by the structure of words and sentences; the use of “personal words” or “personal sentences” constitutes the other. The Flesch reading ease score found in the window that appears when you choose the menu command: File > Text Analysis… represents the first of these and is determined by the following steps:
1. Nisus Writer Pro counts the words in the entire document. Contractions, hyphenated words and numbers (such as dates and dollar amounts, etc.) are each counted as one word.
2. Nisus Writer Pro then computes the average sentence length, where colons, semicolons, and newline characters are not considered to terminate/delimit a sentence.
3. Nisus Writer Pro counts the syllables in the entire document. This syllable counting uses a special algorithm which, although not perfect, is quite accurate for the English language. It divides the number of syllables by the number of words to get the average number of syllables per word. Although the syllables in numbers and symbols should be counted as if these were pronounced when read aloud (for example, 1991 read as “nineteen ninety one” has 2, 2, and 1 syllables), this is not done in Nisus Writer Pro at this time.
4. Nisus Writer Pro multiplies the average sentence length by 1.015 and the average number of syllables per word by 84.6, then subtracts both results from 206.835. The result is the Reading Ease Score. The scale ranges from 0 to 100. The higher the score, the easier the text is to read.
The formulas:
Flesch:
206.835 - (1.015 * averageWordsPerSentence) - (84.6 * averageSyllablesPerWord)
FleschKincaid:
(0.39 * averageWordsPerSentence) + (11.8 * averageSyllablesPerWord) - 15.59
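The two formulas above can be sketched in Python as follows. The function names and the sample counts are illustrative assumptions; counting words, sentences, and syllables is left to the caller, and this is not Nisus Writer Pro's own code.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease from raw counts (higher means easier)."""
    average_words_per_sentence = words / sentences
    average_syllables_per_word = syllables / words
    return (206.835
            - 1.015 * average_words_per_sentence
            - 84.6 * average_syllables_per_word)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from the same raw counts."""
    average_words_per_sentence = words / sentences
    average_syllables_per_word = syllables / words
    return (0.39 * average_words_per_sentence
            + 11.8 * average_syllables_per_word
            - 15.59)


# Hypothetical document: 100 words, 5 sentences, 150 syllables.
ease = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)
print(f"{ease:.3f}")   # 59.635
print(f"{grade:.2f}")  # 9.91
```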

FOG Index scores
This is a “reading grade level” score. It is calculated by adding the average sentence length to the percentage of long words and multiplying the sum by 0.4. A long word is one with more than two syllables.
Its formula is:
F = 0.4 * ((average words per sentence) + 100 * (long words / total words))
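A minimal Python sketch of the FOG formula, again working from counts supplied by the caller. The function name and the sample numbers are assumptions made for the example.

```python
def fog_index(words, sentences, long_words):
    """FOG index: 0.4 * (average sentence length + percentage of long words)."""
    average_words_per_sentence = words / sentences
    percent_long_words = 100 * (long_words / words)
    return 0.4 * (average_words_per_sentence + percent_long_words)


# Hypothetical document: 100 words, 5 sentences, 10 words of three or more syllables.
score = fog_index(100, 5, 10)
print(f"{score:.1f}")  # 12.0
```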

font or typeface According to the Wikipedia, “a typeface is a coordinated set of glyphs designed with stylistic unity. A typeface usually comprises an alphabet of letters, numerals, and punctuation marks; it may also include ideograms and symbols, or consist entirely of them, for example, mathematical or map-making symbols. The term typeface is often conflated with font, a term which, historically, had a number of distinct meanings before the advent of desktop publishing; these terms are now effectively synonymous when discussing digital typography. A helpful and still valid distinction between font and typeface is a font's status as a discrete commodity with legal restrictions, while typeface designates a visual appearance or style not immediately reducible to any one foundry's production or proprietary control.”

glyph The shape of the basic unit in any written language. These include letters, numerals, punctuation marks, as well as Chinese and Japanese characters.

grapheme The basic unit in any written language.

gremlin Loosely, “gremlins” are any non-printing characters that serve no useful purpose. Currently Nisus Writer Pro defines the following code points as gremlins:
U+0000 to U+0008 ASCII Null to Backspace
U+000B Vertical Tab
U+0014 to U+0017 ASCII Shift Out to Unit Separator
U+0082 to U+0083 ASCII Break Permitted and Negation
U+0086 to U+009F ASCII Start Of Selected Area to Application Program Command
U+E000 to U+F700 Private use area, which is technically E000–F8FF; Apple has assignments starting at F700
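The ranges listed above can be turned into a simple check. The following Python sketch is not how Nisus Writer Pro detects gremlins; it copies the code point ranges exactly as they are printed in this glossary and assumes both ends of each range are inclusive. The names GREMLIN_RANGES, is_gremlin, and find_gremlins are made up for the example.

```python
# Code point ranges copied from the list above; inclusive ends are an assumption.
GREMLIN_RANGES = [
    (0x0000, 0x0008),
    (0x000B, 0x000B),
    (0x0014, 0x0017),
    (0x0082, 0x0083),
    (0x0086, 0x009F),
    (0xE000, 0xF700),
]


def is_gremlin(character):
    """Return True if the character falls inside one of the listed ranges."""
    code_point = ord(character)
    return any(low <= code_point <= high for low, high in GREMLIN_RANGES)


def find_gremlins(text):
    """Return (index, 'U+XXXX') pairs for every gremlin found in the text."""
    return [(index, f"U+{ord(ch):04X}")
            for index, ch in enumerate(text) if is_gremlin(ch)]


sample = "Hello\x00 world\x0b!"
print(find_gremlins(sample))  # [(5, 'U+0000'), (12, 'U+000B')]
```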

GREP GREP is an acronym which means “search globally for lines matching the regular expression, and print them”. The important part here is “regular expression”. Using GREP you can search for “text patterns” (regular expressions: series of numbers, or series of letters, etc.). Nisus Writer Pro uses a variant of GREP and makes it available to its users in a menu-driven form that uses human language and visual cues.
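To give a feel for what a “text pattern” looks like, here is a short sketch using Python's re module. Python's regular expression syntax is its own dialect and may differ in details from the variant used in Nisus Writer Pro; the sample sentence is invented for the example.

```python
import re

text = "Invoice 2041 was paid on 2023-05-17; invoice 2042 is pending."

# \d+ matches any series of one or more digits.
number_runs = re.findall(r"\d+", text)
print(number_runs)  # ['2041', '2023', '05', '17', '2042']

# A more specific pattern: four digits, a hyphen, two digits, a hyphen, two digits.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(dates)  # ['2023-05-17']
```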

gzip A widely supported compression/archive format.
The new "Nisus Compressed Rich Text" format (file extension “zrtf”) is a way to reduce the size of files that Nisus Writer Pro saves. It is basically the normal Nisus Writer Pro RTF, zipped and saved to disk. No other application understands this format, but it reduces file sizes by a large amount. In a pinch, however, a user can rename a file from "whatever.zrtf" to "whatever.rtf.zip" and let the Finder expand it into a normal RTF file.

ideogram/ideograph
A graphic symbol like an icon on your Macintosh Desktop, or the buttons you click in the Nisus Writer Pro interface. These represent an idea, rather than a group of letters arranged according to the sounds they might represent in a spoken language. Some writing systems (notably those of East Asia and the hieroglyphics of ancient Egypt) are considered “ideographic” even though many of the symbols in these systems represent words or small bits of meaning, rather than complete ideas.

kerning The process of adjusting the spacing between letter pairs in a proportional font (see also ligature and tracking).

leading During the period of moveable type, small strips of lead were placed between the lines of text in order to increase the space and readability. This artifact gave its name to what is now often referred to as “line spacing”. Leading (which refers to vertical spacing) should not be confused with tracking, which refers to the horizontal spacing between letters or characters.

ligature A complex glyph created when multiple letter-forms join into one, usually replacing two sequential characters that share a common component, such as the ascender of an f becoming the dot of the i that follows it: f + i = ﬁ. Not all fonts support ligatures (see also kerning and tracking).

metacharacter Any character that has a meaning other than its literal meaning; in particular for work with GREP.

orphans and widows
The easiest way to remember the difference between an orphan and a widow is to remember that orphans are “left behind” and widows are forced to “go on ahead by themselves”, just as an orphan or widow in life. Orphans are separated segments of text at the beginning of a paragraph or sentence, while widows are separated segments of text at the end of a paragraph or sentence.

pathname See “file path”.

script A “writing system”.

Sandbox … as in “every application (read “child”) plays in its own sandbox and does not use other children’s toys”. More, from the Wikipedia article “Sandbox (computer security)”: “In computer security, a sandbox is a security mechanism for separating running programs. It is often used to execute untested code, or untrusted programs from unverified third parties, suppliers, untrusted users and untrusted websites.”

text attributes/formatting
During the days of the Classic Macintosh OS, almost all applications had a Style menu. The commands of this menu applied attributes to the text. In the Mac system (macOS), the standard for applications is to have a Format menu through which a wide variety of formatting characteristics can be applied to the text. These include things that would have been considered “styles” applied to individual characters, as well as controls that affect the shapes of paragraphs.

text encoding At its core, the computer recognizes only ones and zeros. In order to display all of them as text and images in your document and store them appropriately on your hard drive, as well as send them to others so that they can read what you have created, the computer needs to “convert” those digits back and forth between long strings of ones and zeros… and more humanly-recognizable symbols. A variety of ways of converting your text exist. These are called Text Encoding methods. Among these are ASCII and Unicode.

tracking Also known as letter spacing, or character spacing, tracking refers to the space between all the letters of a word (see also kerning and ligature).

transliteration A type of conversion of a text from one script to another that involves swapping letters. This is not to be confused with transcription, which represents sounds rather than letters.

twip A twip (abbreviating “twentieth of a point”, “twentieth of an inch point”, or “twentieth of an Imperial point”) is a typographical measurement, defined as 1/20 of a typographical point. One twip is 1/1440 inch or 17.639 µm.
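The relationships in this definition are easy to express as conversions. The Python sketch below uses function and constant names invented for the example; it relies only on 20 twips per point, 1440 twips per inch, and 25,400 µm per inch.

```python
TWIPS_PER_POINT = 20
TWIPS_PER_INCH = 1440
MICROMETRES_PER_INCH = 25_400


def points_to_twips(points):
    """Convert typographical points to twips."""
    return points * TWIPS_PER_POINT


def twips_to_inches(twips):
    """Convert twips to inches."""
    return twips / TWIPS_PER_INCH


def twips_to_micrometres(twips):
    """Convert twips to micrometres."""
    return twips * MICROMETRES_PER_INCH / TWIPS_PER_INCH


print(points_to_twips(12))                # 240
print(twips_to_inches(1440))              # 1.0
print(round(twips_to_micrometres(1), 3))  # 17.639
```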

Unicode All computers know only numbers (essentially “0” and “1” and a near-infinite number of combinations of those). To store letters, you have to assign them numbers. For example, 1=A, 2=B, and so on. In the old days, you could use different “encodings” that assigned different numbers to different letters depending on the language you were working in, etc. One example of this is ASCII.
Unicode is an encoding just like ASCII, ISO Latin-1, etc. However, it assigns a number to virtually every letter (and diacritic) for nearly every alphabet on the planet, past or present.
This is really useful, especially when mixing characters from lots of different languages. In Nisus Writer Classic, the only way to store letters from different alphabets is to use different encodings. When Nisus Writer Pro reads in the file (which can contain only numbers, remember), it first has to figure out what encoding you used so Nisus Writer Pro can match it to the right letter.
Nisus Writer Classic format decided what encoding to use based on the font you applied. If your font is not available in the new System, Nisus Writer Pro tries to guess. Sometimes it works; a lot of times it doesn’t. On the Mac system, fonts sometimes don’t show up or work as they did in OS 9. This is the primary reason people see garbage text when they try to open Classic files.
Unicode, on the other hand, is much better. The number 65 always means capital “A”, for example, no matter what. So your text is better preserved, and it is far simpler for Nisus Writer Pro to deal with multilingual text.

UTF-16 A “Unicode Transformation Format” that uses 16-bit units to hold the information about the characters in your document. Each character takes one or two of these units.

UTF-8 A “Unicode Transformation Format” that uses 8-bit units (bytes) to hold the information about the characters in your document. Each character takes from one to four bytes.
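The difference between the two formats shows up when the same characters are encoded both ways. This Python sketch counts how many 8-bit units UTF-8 needs and how many 16-bit units UTF-16 needs for a few sample characters; the sample characters and variable names are chosen only for illustration.

```python
samples = ["A", "é", "日", "😀"]

unit_counts = {}
for character in samples:
    utf8_units = len(character.encode("utf-8"))           # 8-bit units (bytes)
    utf16_units = len(character.encode("utf-16-be")) // 2  # 16-bit units
    unit_counts[character] = (utf8_units, utf16_units)
    print(f"U+{ord(character):04X}: UTF-8 units = {utf8_units}, "
          f"UTF-16 units = {utf16_units}")

# Prints:
# U+0041: UTF-8 units = 1, UTF-16 units = 1
# U+00E9: UTF-8 units = 2, UTF-16 units = 1
# U+65E5: UTF-8 units = 3, UTF-16 units = 1
# U+1F600: UTF-8 units = 4, UTF-16 units = 2
```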

WYSIWYG What You See Is What You Get
Used to describe something where the content during editing appears very similar to the final product.

