Citation Cross-reference problem

Get help using and writing Nisus Writer Pro macros.
phspaelti
Posts: 1313
Joined: 2007-02-07 00:58:12
Location: Japan


Post by phspaelti »

In the thread http://www.nisus.com/forum/viewtopic.php?f=18&t=11866 Þorvarður asks how to build automatic citations using the AIAA Citation Style.
Using the referenced style guide as input I have created the following test file:
Xref Citations Test.rtf
Using this style I will discuss how to approach this problem. Þorvarður gives a nice description of how to do this step by step:
  1. Find all the references which are marked by bracketed numbers [1] at the beginning of the line
  2. Bookmark the references
  3. Find the citations in the text
  4. Cross-reference them
The first two steps are easy enough to do by hand, and would be just as easy to do as a macro:

Code: Select all

Find All '^\[\d+\]', 'Ea'
Add Bookmarks
But an issue that comes up here is how cross-references work. Since the bookmarked text is what will appear in place of the cross-reference, we need to worry about the format of the citations in the text. The AIAA specification allows all of the following cases:
  • [4]
  • [4, 5]
  • [4-7]
  • [4, p.35]
  • [4, pp.25-28]
In all of these cases we will want to cross-reference the [4] to the corresponding reference in the reference list. But a first question is what to do about the "5" in [4, 5] and the "7" (and the implied "5" and "6") in [4-7]. Since such cases are presumably always sequential, it would seem sufficient to cross-reference only the first number in each of these cases, as clicking on this first number would take you close enough. That is, I assume there are no instances of references like [4, 12, 31].

Once this has been cleared up, we can be more specific about the relevant task: in the reference list we will need to bookmark just the actual number, since if we bookmarked "[4]" as a whole we would end up with things like "[4]-7]" or "[4], p. 35]".

The next technical issue comes up when creating the cross-reference: we will need some way to get the relevant bookmark. If we create all the bookmarks at once, it will be hard to know for sure how to access the bookmark for the reference "[4]", since the name for the bookmark will be auto-generated. To solve this it will be better to create the bookmarks one at a time and save them in an Array or Hash as we create them, so we can recall the relevant bookmark when we need it.
Thus we should modify the above code as follows:

Code: Select all

$doc = Document.active
$referenceSels = $doc.text.find '(?<=^\[)\d+(?=\])', 'Ea'
$references = Array.new
foreach $sel in $referenceSels
    $num = $sel.substring
    $bookmark = Bookmark.newAutonamedAddedToTextSelection @false, $sel
    $references[$num] = $bookmark
end
So first I have modified the find expression to select only the number inside the brackets. The loop also gives us a handy way to get at the bookmarks, which are stored in an Array called "$references". The bookmark for reference "[4]" will be stored in "$references[4]".
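
For anyone who wants to experiment with the idea outside Nisus, here is a rough Python analog (this is not Nisus macro code). It assumes the reference list is plain text with entries like "[4] ..." at the start of a line, and it uses a capture group where the macro uses lookbehind and lookahead. The names `sample` and `index_references` are made up for the illustration, and the stored (start, length) pair only stands in for a real bookmark.

```python
import re

def index_references(text):
    """Map each reference number to the (start, length) of its digits."""
    references = {}
    # ^\[(\d+)\] : a bracketed number at the start of a line;
    # group 1 covers the digits only, without the brackets.
    for m in re.finditer(r'^\[(\d+)\]', text, flags=re.MULTILINE):
        references[int(m.group(1))] = (m.start(1), len(m.group(1)))
    return references

sample = "As shown in [4, p.35] ...\n[4] Smith, J.\n[5] Jones, K.\n"
refs = index_references(sample)
```

The in-text citation "[4, p.35]" is skipped because it does not sit at the start of a line, so only the two list entries end up in `refs`.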

With this problem solved we can now consider how to insert the cross-references.
Inserting cross-references in the macro language is a two-step process.
  1. Create the Cross-reference object
  2. Insert it into the text using ".insertAtIndex"
Added to this are two more steps: first, locate the references in the text, and second, remove the reference from the text (so we can insert the cross-reference instead).
philip

Re: Citation Cross-reference problem

Post by phspaelti »

So after a short break.

1. Find the references in the text. I am going to assume that we want to find all numbers in brackets that are not at the beginning of the paragraph. Basically:

Code: Select all

Find All '(?<=.\[)\d+(?=\])', 'Ea'
As mentioned before, however, we also need to allow for the other formats. One could write a complicated expression to cover all the possible cases. Or one could just leave off the closing bracket, since we are only ever interested in the first number, but that might be a bit too permissive. A happy medium might be this:

Code: Select all

Find All '(?<=.\[)\d+(?=\]|[-,][-, \dp\.]+\])', 'Ea'
So a number preceded by a character and a bracket, and followed by either (a) a bracket, or (b) a comma or a hyphen and then some combination of digits, space, "p", ".", comma or hyphen before a closing bracket.
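
As a quick sanity check, the same expression can be tried in Python, whose `re` module accepts this fixed-width lookbehind. This is only a sketch on a made-up test string (`body` and `first_numbers` are names invented here), and it does not prove that the Nisus find engine behaves identically in every corner case.

```python
import re

# The expression from the post, unchanged.
CITATION = re.compile(r'(?<=.\[)\d+(?=\]|[-,][-, \dp\.]+\])')

def first_numbers(text):
    """Return the first number of every in-text citation, as strings."""
    return CITATION.findall(text)

# One citation per allowed format, then a reference-list entry on its own line.
body = "See [1], [2, 3], [4-7], [8, p.35] and [9, pp.25-28].\n[1] Smith, J."
found = first_numbers(body)
# found == ['1', '2', '4', '8', '9']
```

The "[1]" that starts the second line is not matched, because "." in the lookbehind does not match the preceding line break.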

My general approach to such tasks is to construct the find expression in the find box, using a test file with the relevant cases. Then when I am happy I macroize (Copy to Clipboard) and then change it to the macro construct:

Code: Select all

$textReferenceSels = $doc.text.find '(?<=.\[)\d+(?=\]|[-,][-, \dp\.]+\])', 'Ea'
The macro construct has the advantage that it directly returns an array of text selections, which are useful for further processing.

2. Our next step is to delete these text references and insert the cross-references in their place. This will be done in a loop, and since we are changing the text, it is best to work backwards:

Code: Select all

foreach $sel in reversed $textReferenceSels
    # Create the cross-reference
    $num = $sel.substring
    $xref = CrossReference.newWithBookmark $references[$num]
    # Delete the text reference
    $sel.text.deleteInRange $sel.range
    # Insert the cross reference and set the type
    $sel.text.insertAtIndex $sel.location, $xref
    $xref.setReferenceType "BookmarkedText"
end
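
The shape of that loop can also be sketched in Python on a plain string. This is an analog only: the "{xref:N}" placeholder is invented here and merely stands in for a real cross-reference object, and `link_citations` is a hypothetical name.

```python
import re

CITATION = re.compile(r'(?<=.\[)\d+(?=\]|[-,][-, \dp\.]+\])')

def link_citations(text):
    """Replace each cited number with a '{xref:N}' placeholder, last match first."""
    matches = list(CITATION.finditer(text))
    for m in reversed(matches):
        # Earlier matches keep valid offsets because we only edit after them.
        text = text[:m.start()] + "{xref:" + m.group() + "}" + text[m.end():]
    return text

linked = link_citations("See [4-7] and also [12, p.35].")
# linked == "See [{xref:4}-7] and also [{xref:12}, p.35]."
```
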
Gotta run.
----
As anyone who actually tried to follow along may have noticed, I was in a bit of a hurry when I wrote this, so I didn't have time to edit it properly. Some errors have now been fixed.
philip
Þorvarður
Posts: 410
Joined: 2012-12-19 05:02:52

Re: Citation Cross-reference problem

Post by Þorvarður »

Thank you Philip for the step-by-step instruction of how to join the two parts. If I come across any errors I'll let you know. I'll have to study this carefully. You are using commands that I'm not yet familiar with (substring, deleteInRange, insertAtIndex), and I have forgotten why you always do things in reversed order.

I upload the macro for the original poster so that he can install it directly.

Þ.
Attachments
Hyperlink reference to bibliography [Philip Spaelti].nwm.zip
credneb
Posts: 187
Joined: 2007-03-28 07:30:34

Re: Citation Cross-reference problem

Post by credneb »

I always find Philip's tutorials educational.

As to "I have forgotten why you always do things in reversed order."

unless memory fails, it is because making changes in the text from top-down can change the location of text below where the change is made, and that can create problems. So by going from end to beginning, the changes made won't affect the preceding text.

At least that's how I have understood it.

Re: Citation Cross-reference problem

Post by phspaelti »

Þorvarður wrote: 2021-06-21 05:42:47 You are using commands that I'm not yet familiar with (substring, deleteInRange, insertAtIndex), and I have forgotten why you always do things in reversed order.
The key to such macros is the thing NWP calls a TextSelection object. This isn't really a difficult concept. A TextSelection (I'll just call it a selection from now on) is pretty much the analog of the thing that you can create with the cursor; it's a run of characters out of a larger text. The way NWP handles these is by remembering, as a unit, the starting point, the length, and the text (object) on which the selection is made, or maybe I should say intended. That's because NWP doesn't actually make the selection. So it's more of a "potential" or "virtual" selection.

So let's consider an example. Imagine I had the following text:

This is an apple. Would you like an apple?

Now let's compare the following two commands:

Code: Select all

Find All 'apple'
$doc.text.findAll 'apple'
The first will select and highlight the two instances of the word "apple".
The second will not visibly do anything. Instead it will return two such "virtual" selections. As already noted these selections are determined by starting point, length, and text (object). The starting point of the first instance of "apple" is 11, the length 5, and the text is a Nisus internal identifier for this text object. The second instance is pretty much the same, except that the starting point is 36.
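
To check those numbers without Nisus, here is a tiny Python illustration. It models each "virtual" selection as a plain (location, length) pair, which is a simplification for the sake of the example: a real TextSelection also remembers its text object.

```python
import re

text = "This is an apple. Would you like an apple?"

# One (location, length) pair per match; offsets count from 0.
selections = [(m.start(), len(m.group())) for m in re.finditer("apple", text)]
# selections == [(11, 5), (36, 5)]
```
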

One point to keep in mind with this is that the text of a selection is always the entire (larger) text on which the selection is being made. The part that is actually selected (the word "apple" in this case) is the subtext. The two numbers that pick the selection out of the larger text can be viewed as a unit, in which case we call them the range.

The neat part of all of this is that we can now manipulate the larger text using this information. Nisus text objects have a large number of commands that allow us to edit the text using such information. .deleteInRange will, logically enough, delete the part of the text that is specified by the range. So if we use the range of the first instance of "apple", then the .deleteInRange command would result in the following text:

This is an . Would you like an apple?

Similar commands can be used to replace parts, add parts, etc.
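
In Python terms, deleting by range amounts to slicing around the range. The helper below, `delete_in_range`, is a hypothetical stand-in written for this illustration; it returns a new string, whereas the Nisus command edits the text object in place.

```python
def delete_in_range(text, location, length):
    """Return text with `length` characters removed starting at `location`."""
    return text[:location] + text[location + length:]

result = delete_in_range("This is an apple. Would you like an apple?", 11, 5)
# result == "This is an . Would you like an apple?"
```
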

Okay, now why do we need to edit such text objects in reverse order? The "weak" point of such selections is that they really just consist of such range information, and the range really just consists of numbers. The selection is "virtual". If I wait too long, the text object might change, and then the range numbers might be "off", and no longer point to the part of the text object originally intended.

Imagine I replaced the first instance of "apple" with "banana". Now my whole text object will be one character longer (because "apple" has 5 letters and "banana" has 6). But this also means that the starting point of the second instance of "apple" will change from 36 to 37. Conversely the range defined by the numbers 36 and 5 will now point to the subtext " appl" rather than the intended "apple". So we will want to work backwards, because changes always push towards the end. If we replace the second instance first, it will still be in the correct location. And even though the overall length of the text changes, the first instance will still be in the same spot as before.
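
The effect is easy to reproduce in Python with the two ranges collected up front. This is a sketch on a plain string, with `replace_range` as a made-up helper, but the arithmetic is the same as described above.

```python
def replace_range(text, location, length, new):
    """Return text with the given range replaced by `new`."""
    return text[:location] + new + text[location + length:]

text = "This is an apple. Would you like an apple?"
ranges = [(11, 5), (36, 5)]  # both instances of "apple", found before any edit

# Front to back: the second range is stale after the first edit.
forward = text
for loc, length in ranges:
    forward = replace_range(forward, loc, length, "banana")
# forward == "This is an banana. Would you like anbananae?"

# Back to front: every range is still valid when it is used.
backward = text
for loc, length in reversed(ranges):
    backward = replace_range(backward, loc, length, "banana")
# backward == "This is an banana. Would you like an banana?"
```

Going forward, the stale range (36, 5) hits " appl" and leaves the stray "e" behind; going backward, both replacements land where intended.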

We might call this "don't look back" editing mode the "Louis XV principle" (Après moi, le déluge!).

The last small point is the .substring command. This is just the same as subtext, except that we ignore the formatting. One could have used either. But in general if the formatting is not germane to the task, I use the string variant.

Hope this makes things a bit clearer.
philip