Thursday, October 16, 2008

New Cookbook Blog

I've started a new blog which will be mainly like a 'cookbook' for Adobe Creative Suite scripters - how-to articles that show how to write much more powerful scripts.

http://www.rorohiko.com/wordpress

The first entry in the blog is up - it's about how you can 'attach' little icons or labels to the four sides of page items as a means of feedback to the user of your scripts. The sample script adds a little floating label with the word count to each text frame - so the user can see the number of words in each text frame in the blink of an eye.

Check it out!

Cheers,

Kris

Saturday, July 12, 2008

Using State Machines: Web Access From Adobe InDesign CS3 ExtendScript - click to listen to podcast

Sample files for this podcast can be downloaded from:

http://www.rorohiko.com/podcast/geturl.zip

This podcast will explain how you can query web services from within InDesign CS3 ExtendScript. No need for plug-ins, external libraries - just Adobe InDesign ExtendScript, pure and simple.


>>> Edit#3:
Adjusted the script to handle redirections, by interpreting the 'HTTP/1.0 301 Moved Permanently' return status.
<<<


>>> Edit#2:
Adjusted the script to give much faster downloads in case the Content-Length header is not present in the web server headers. Also changed the protocol to HTTP/1.0 instead of HTTP/1.1 to sidestep the issue of 'chunked' downloads - support for 'chunked' HTTP is left as an exercise.
<<<


>>> Edit#1:
With regard to InDesign CS5: the downloadable code won't work as-is with CS5 due to some oddity with the String.replace function.

To work around the issue, you need to change the ParseURL function to read:

...
function ParseURL(url)
{
    url=url.replace(/([a-z]*):\/\/([-\._a-z0-9A-Z]*)(:[0-9]*)?\/?(.*)/,"$1/$2/$3/$4");
    url=url.split("/");

    // ADD THE LINE BELOW FOR INDESIGN CS5
    if (url[2] == "undefined") url[2] = "80";

    var parsedURL = 
...

That fixes it up so it works with CS5 too...
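For comparison, here is a minimal sketch of a URL splitter that sidesteps the optional-group oddity by not using String.replace at all. It is not the ParseURL() from geturl.zip: the name parseURLSketch and the shape of the returned object are made up for illustration, and the default port of 80 is simply carried over from the workaround above.

```javascript
// Hypothetical helper, NOT the ParseURL() from the download.
// It reads the capture groups from RegExp.exec() instead of building
// a "$1/$2/$3/$4" replacement string.
function parseURLSketch(url) {
    var m = /^([a-z]+):\/\/([-._a-zA-Z0-9]+)(?::([0-9]+))?\/?(.*)$/.exec(url);
    if (m === null) {
        return null; // not something this sketch understands
    }
    return {
        protocol: m[1],
        host: m[2],
        // The port group is optional; when it did not take part in the
        // match there is nothing usable in m[3], so fall back to 80.
        port: m[3] ? parseInt(m[3], 10) : 80,
        path: "/" + m[4]
    };
}

var sample = parseURLSketch("http://www.rorohiko.com/podcast/geturl.zip");
// sample.host is "www.rorohiko.com", sample.port is 80
```

The explicit test on m[3] means it does not matter whether a particular JavaScript engine reports a missing optional group as undefined or as an empty string.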

<<<

I'll also present some useful routines I wrote, called GetURL() and ParseURL(). GetURL() is a fairly large routine which demonstrates how you can use a programming pattern called 'a state machine' to process data on-the-fly as it is received from a network connection.

To demonstrate how to use the GetURL() function, I've also added a useful sample script. The script will search your active InDesign document for any page items that have a URL as their script label (entered via Window - Automation - Script Label). It will then fetch the data 'behind' the URL and place that data into the page item.

Install the script GetURLs.jsx in the InDesign scripts folder (the easiest way is to bring up the scripts palette from InDesign: select Window - Automation - Scripts, and then right-click the User folder).

Select Reveal in Finder or Reveal in Explorer. Then copy the script into the Scripts Panel folder you should now see appear.

Switch back to InDesign, and open up the sample document GetURLSample.indd. Run the script GetURLs.jsx from the palette by double-clicking it on the palette. The empty frames should fill up with images or text (at least, if you are connected to the Internet).

So, how does it all work?

At the heart of it all is the standard ExtendScript object called Socket. More info about the Socket can be found in the JavaScript Tools Guide for CS3:

http://www.adobe.com/devnet/bridge/pdfs/javascript_tools_guide_cs3.pdf

The socket object gives us the ability to perform low-level network communications - we can set up TCP/IP connections with other computers on the network.

The problem is that the Socket object has no higher-level functionality - it has no support for any protocols, like HTTP for example.

To fix that, you could try and use the protocol support that is available via Bridge with the HttpConnection object. You could also forcibly give InDesign access to the webaccesslib that is used by Bridge, through some 'fiddling around'.

However, personally I am not too keen on either of these approaches - they're either a bit too big or too brittle to my liking; I wanted to have an 'InDesign all by itself' solution.

The alternative approach I used was to provide HTTP support in pure ExtendScript to InDesign.

Now, before diving into this: be warned, this is NOT a fully fledged, fully compliant HTTP client. I've only implemented a subset of the protocol, just enough to let me do what I needed to do.

For example, the code only supports UTF-8 text encoding. If your target web server does not offer that you'll have to add some additional code to the scripts to cope with that. Also, I've only implemented HTTP 'GET' requests, not 'POST'. Adding that functionality would be fairly easy to do - it's left as an exercise.

So, the Socket object allows us to send out requests and receive replies via TCP/IP.

The HTTP protocol is quite extensive, but the basis of it is simple - it is mainly a plain text-based, ping-pong protocol. You send out a request, and you get a reply.

The (incoming) reply is composed of three parts: a start (or status) line, zero or more header lines, and then an optional body (which can be binary data or text).

The (outgoing) request is similarly composed of a request line, and zero or more header lines, and an optional body (for POST requests, which I am not implementing here).

Immediately after the start line, you get zero or more header lines.

Header lines are themselves separated from the following body of the request or reply by an empty line - so a request or reply always has the same 'rhythm' to it: start line, header lines, empty line, body.

The GetURL() script code in the GetURLs.jsx implements this in a rudimentary fashion: the request is a simple multi-line text string, which is fired off to the web server via a Socket object.
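To make the 'rhythm' concrete, here is a minimal sketch of what such a request string could look like. The function name buildGetRequest and the choice of headers are my assumptions for illustration; the actual request text inside GetURL() may well differ.

```javascript
// Hypothetical helper: builds the text of a simple HTTP/1.0 GET request.
// Rhythm: request line, header lines, empty line (no body for a GET).
function buildGetRequest(host, path) {
    var CRLF = "\r\n";
    return "GET " + path + " HTTP/1.0" + CRLF +
           "Host: " + host + CRLF +
           "Connection: close" + CRLF +
           CRLF; // the empty line that ends the headers
}

var request = buildGetRequest("www.rorohiko.com", "/podcast/geturl.zip");

// In ExtendScript, a string like this would then be written to a Socket,
// roughly along these lines (not executed here, shown as an assumption):
//   var conn = new Socket();
//   if (conn.open("www.rorohiko.com:80", "binary")) { conn.write(request); }
```

Note that every line, including the final empty one, ends in a CRLF pair; leaving out that last CRLF leaves the server waiting for more headers.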

Then the bulk of the code is for interpreting the reply from the web server - there are three levels of decoding that need to happen.

First of all, we need to decode the reply itself - separate the start/status line from the headers and from the body.

The headers also will contain some important information: the length of the body that will follow. So we need to interpret that header line in order to know exactly how many bytes to read from the socket.

At a lower level, we need to 'chop' the reply up into individual lines until we reach the body - the status line and the individual lines are separated by CRLF character pairs (CR = ASCII character 13, LF = ASCII character 10).
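As a point of reference, here is a small sketch of those first two levels done the 'obvious' way, on a reply that has already been received in full. This is deliberately NOT how GetURL() does it (GetURL() decodes on-the-fly with state machines); the name splitReply and the returned object are invented for this example.

```javascript
// Hypothetical helper: splits a complete HTTP reply into its parts
// using plain string functions.
function splitReply(raw) {
    var end = raw.indexOf("\r\n\r\n"); // the empty line after the headers
    if (end < 0) {
        return null; // headers not complete
    }
    var lines = raw.substring(0, end).split("\r\n");
    var reply = {
        statusLine: lines[0],
        headers: {},
        body: raw.substring(end + 4),
        contentLength: -1 // -1 means: no Content-Length header seen
    };
    for (var i = 1; i < lines.length; i++) {
        var colon = lines[i].indexOf(":");
        if (colon > 0) {
            // Header names are case-insensitive, so store them in lowercase
            var name = lines[i].substring(0, colon).toLowerCase();
            var value = lines[i].substring(colon + 1).replace(/^\s+|\s+$/g, "");
            reply.headers[name] = value;
        }
    }
    if (reply.headers["content-length"] !== undefined) {
        reply.contentLength = parseInt(reply.headers["content-length"], 10);
    }
    return reply;
}
```

This version needs the whole reply in memory before it can start, which is exactly the limitation the state machine approach avoids.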

And finally, at the lowest level, while reading a text-based body, we need to interpret UTF-8 code and convert the UTF-8 codes into plain Unicode. This means that we need to read through codes that are 1, 2, 3 or 4 bytes long, and each of them encodes a single Unicode character.
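Here is a minimal sketch of such a UTF-8 decoder, written as a tiny state machine where the state is simply 'how many continuation bytes are still expected'. It is an illustration under my own assumptions, not the code from GetURL(): the names decodeUTF8 and fromCodePoint are made up, and the sketch does not reject overlong or out-of-range sequences.

```javascript
// Hypothetical sketch: decode an array of byte values (0-255) into a string.
function decodeUTF8(bytes) {
    var out = "";
    var pending = 0;   // state: continuation bytes still expected
    var codePoint = 0; // bits collected so far for the current character
    for (var i = 0; i < bytes.length; i++) {
        var b = bytes[i] & 0xFF;
        if (pending === 0) {
            if (b < 0x80) {                    // 1-byte code
                out += String.fromCharCode(b);
            } else if ((b & 0xE0) === 0xC0) {  // start of a 2-byte code
                codePoint = b & 0x1F; pending = 1;
            } else if ((b & 0xF0) === 0xE0) {  // start of a 3-byte code
                codePoint = b & 0x0F; pending = 2;
            } else if ((b & 0xF8) === 0xF0) {  // start of a 4-byte code
                codePoint = b & 0x07; pending = 3;
            } else {
                out += "\uFFFD";               // stray or invalid byte
            }
        } else if ((b & 0xC0) === 0x80) {      // continuation byte
            codePoint = (codePoint << 6) | (b & 0x3F);
            pending--;
            if (pending === 0) {
                out += fromCodePoint(codePoint);
            }
        } else {
            out += "\uFFFD"; // sequence was cut short
            pending = 0;
            i--;             // look at this byte again as a fresh start
        }
    }
    return out;
}

// JavaScript strings are UTF-16, so code points above U+FFFF
// need to be written as a surrogate pair.
function fromCodePoint(cp) {
    if (cp < 0x10000) {
        return String.fromCharCode(cp);
    }
    cp -= 0x10000;
    return String.fromCharCode(0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF));
}
```

Because the only things carried from one byte to the next are pending and codePoint, the bytes can be fed in one at a time as they arrive from the socket.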

In the GetURL() routine, these three levels of encoding are decoded concurrently, through three nested 'state machines'. Using state machines makes the code fairly fast - much faster than could ever be achieved through string matching.

I won't go into the intricate details of how GetURL() works - the script is fairly well documented and you should be able to figure out how it works by careful reading and stepping through it with the ExtendScript debugger.

Instead, I want to explain a little bit more about state machines - they are a very powerful technique for fast pattern matching and parsing, and once you 'get' them, they are easy to use. They are used in mechanisms like GREP, in compilers and interpreters, in all kinds of text parsers, and so on.

All too often I see code that uses straight string functions to achieve some matching goals.

A simple example: you get some data thrown at you which has line endings that might be either CR (ASCII 13), LF (ASCII 10) or one of each (CRLF).

Many people will handle that by reading in all the data into a buffer, then do some pattern search-and-replace. For example,

first replace all CRLF with CR
then replace all LF with CR

After that, the line ending has become a CR throughout the text.
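In JavaScript, that buffered approach could be sketched as follows (the function name normalizeBuffered is just for illustration). The order of the two replacements matters: CRLF has to go first, otherwise each CRLF would end up as two CRs.

```javascript
// Buffered approach: the whole text must be in memory before we start.
function normalizeBuffered(text) {
    return text
        .replace(/\r\n/g, "\r") // first replace all CRLF with CR
        .replace(/\n/g, "\r");  // then replace all remaining LF with CR
}

var cleaned = normalizeBuffered("one\r\ntwo\nthree\rfour");
// cleaned is "one\rtwo\rthree\rfour"
```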

This approach is not necessarily the best. If you receive the data character by character, you can do the clean-up on-the-fly, as the data comes in. There is no need to buffer everything or to run global search-and-replaces - so it can greatly reduce the amount of memory you need, and it may run a lot faster to boot.

With a state machine, you would go about it as follows. First of all, you create a variable (say, myState), and you create some symbolic numerical constants (for example, kNormalState could be a symbolic name for 0, and kSeenCR could be a symbolic name for 1).

For more complex state machines there might be hundreds, even thousands of different states - but in this case, two states will do.

All we'll now do is play with a simple integer variable, and we'll keep track of where we're at by manipulating the state. The idea is that we don't assemble strings or 'memorize' any other input data - we encode the relevant info about 'what has been' into the state variable.

Data flows through our state machine - we read input data, and immediately get rid of the data - we write or store or process it - and we keep as little data as possible inside our state machine logic.

So, our little state machine is happily reading and writing character after character.

After each character we read and process we also check whether it was a CR (ASCII 13) or not, and we change our state to either kNormalState or kSeenCR.

Now, suppose we now read a line feed character (ASCII 10). Before doing anything with a new character, the state machine will always check its current state first.

If the state is kSeenCR we know that this is a line feed after a preceding CR, so we simply don't write the LF out.

If we read a LF and the state is kNormalState, we know that this is a 'stand alone' LF without preceding CR - so we output a CR character to replace it instead.

The state machine is just simple enough to express in words:

Initialize myState to kNormalState
read character (loop until end of file)

    if character is LF then
        if myState is NOT kSeenCR then
            output CR
        end if
    else
        output character
    end if

    if character is CR then
        myState becomes kSeenCR
    else
        myState becomes kNormalState
    end if

end loop
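Translated into JavaScript, the same state machine might look like the sketch below. The names kNormalState, kSeenCR and myState come straight from the description above; the wrapper makeLineEndingFilter and its output callback are my own assumptions about how you might package it.

```javascript
var kNormalState = 0;
var kSeenCR = 1;

// Returns a function that accepts one character at a time and passes
// the cleaned-up characters on to the 'output' callback.
function makeLineEndingFilter(output) {
    var myState = kNormalState;
    return function (character) {
        if (character === "\n") {
            if (myState !== kSeenCR) {
                output("\r"); // stand-alone LF: replace it with a CR
            }
            // LF right after a CR: write nothing
        } else {
            output(character);
        }
        // Remember only what matters about the past: was this a CR?
        myState = (character === "\r") ? kSeenCR : kNormalState;
    };
}

// Usage: feed characters one by one, as they would arrive from a stream.
var result = [];
var feed = makeLineEndingFilter(function (c) { result.push(c); });
var incoming = "one\r\ntwo\nthree\rfour";
for (var n = 0; n < incoming.length; n++) {
    feed(incoming.charAt(n));
}
// result.join("") is "one\rtwo\rthree\rfour"
```

The filter never holds on to more than a single integer between characters, no matter how long the input is.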

This might seem overkill, but the advantages of state machines become apparent when you try more complex things - for example, interpreting a quoted JavaScript string. That string might contain escape sequences (a backslash, followed by a letter or by 1-3 octal digits). Properly interpreting such an 'escaped' string is hard work without a state machine. With a state machine it's a breeze, with hardly any overhead.
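To give an idea, here is a minimal sketch of such an escape-sequence interpreter with three states. It is an illustration only: the names unescapeString, kPlain, kAfterBackslash and kInOctal are made up, and it handles just the single-letter and octal escapes mentioned above (no \x or \u sequences).

```javascript
var kPlain = 0;          // ordinary characters
var kAfterBackslash = 1; // we have just seen a backslash
var kInOctal = 2;        // we are collecting octal digits

function unescapeString(src) {
    var simple = { n: "\n", r: "\r", t: "\t", b: "\b", f: "\f", v: "\x0B" };
    var out = "";
    var state = kPlain;
    var octalValue = 0;
    var octalDigits = 0;
    // Run one step past the end ("" marks end of input) so that a
    // pending octal escape gets flushed.
    for (var i = 0; i <= src.length; i++) {
        var ch = (i < src.length) ? src.charAt(i) : "";
        var isOctal = (ch !== "" && ch >= "0" && ch <= "7");
        var digit = isOctal ? ch.charCodeAt(0) - 48 : 0;
        if (state === kInOctal) {
            if (isOctal && octalDigits < 3 && octalValue * 8 + digit <= 255) {
                octalValue = octalValue * 8 + digit;
                octalDigits++;
                continue;
            }
            out += String.fromCharCode(octalValue); // escape is complete
            state = kPlain; // and ch is handled below as a plain character
        }
        if (state === kAfterBackslash) {
            if (isOctal) {
                octalValue = digit;
                octalDigits = 1;
                state = kInOctal;
            } else {
                // Known letter: translate it; anything else stands for itself
                out += simple.hasOwnProperty(ch) ? simple[ch] : ch;
                state = kPlain;
            }
        } else if (ch === "\\") {
            state = kAfterBackslash;
        } else {
            out += ch;
        }
    }
    return out;
}
```

As in the line-ending example, all the 'memory' lives in the state variable plus a couple of counters; the input is read exactly once, left to right.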

So, that was a quick introduction to state machines - I hope it was enough to pique your interest, and entice you to do a bit of research; once you've added them to your arsenal of techniques, you'll find that some difficult tasks have become a lot easier.

You can download the sample files from the following URL (which is mentioned in the podcast transcript):

http://www.rorohiko.com/podcast/geturl.zip

Sunday, March 30, 2008

Lightning Brain Podcast: Click here to listen to "How To Get Paid For Writing ExtendScripts For Adobe® InDesign®"

In this podcast, I'll be highlighting a single feature of Rorohiko's Active Page Items family of tools that can help you get paid for ExtendScripts you write for Adobe InDesign CS, CS2 or CS3.

To demonstrate how easy it is, I've first created an ExtendScript called FlontFipper. The script itself, and its name, are a bit tongue-in-cheek, but actually might be useful, who knows?

What the script does is look at the currently active document and determine which two fonts are used most often in the document. It then swaps these fonts.

For example, if your document is quite simple, and has headlines in Helvetica and body text in Times Roman, this script would make the headlines Times Roman, and the body text would become Helvetica.
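The core of that idea can be sketched in a few lines of plain JavaScript. This is not the FlontFipper source: it works on a made-up array of text runs rather than on the InDesign document model, the function names are invented, and 'used most often' is assumed here to mean 'covers the most characters'.

```javascript
// Hypothetical input: one entry per run of text, e.g.
//   { font: "Helvetica", length: 40 }
// Returns the names of the two fonts that cover the most characters.
function topTwoFonts(runs) {
    var totals = {};
    var i, name;
    for (i = 0; i < runs.length; i++) {
        name = runs[i].font;
        totals[name] = (totals[name] || 0) + runs[i].length;
    }
    var first = null;
    var second = null;
    for (name in totals) {
        if (!totals.hasOwnProperty(name)) {
            continue;
        }
        if (first === null || totals[name] > totals[first]) {
            second = first;
            first = name;
        } else if (second === null || totals[name] > totals[second]) {
            second = name;
        }
    }
    return [first, second];
}

// Given the pair found above, tells which font a run should switch to.
function swappedFont(font, pair) {
    if (font === pair[0]) { return pair[1]; }
    if (font === pair[1]) { return pair[0]; }
    return font; // all other fonts are left alone
}
```

In a real script the runs would come from the text in the active document, and the result of swappedFont() would be applied back to each run.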

To use the script, the end-user would install the script into the appropriate folder for InDesign scripts. Then he would open a document, and double-click the script name in the InDesign Scripts panel.

Now assume I would like to sell this script to InDesign users.

First of all, I need to download a copy of the APIDToolkit (Active Page Items Developer Toolkit). A functional, time-limited demo version of APIDToolkit is available from the Rorohiko web site.

This toolkit contains lots of stuff, but the single item I am interested in for this podcast is called the InDesignScriptCompiler. Mac and PC versions are provided in the downloadable archive file.

Second, my potential customer needs to install a special runtime plug-in (the APIDKernel) which will enforce the demo and licensing restrictions I want to impose on the use of my FlontFipper script.

To protect my FlontFipper.jsx script, I simply drag/drop it onto the InDesignScriptCompiler icon, and fill in some parameters - mainly to tell APID what the limitations are for the demo mode and via which URL a license can be purchased from me.

The result is an encrypted and protected version of my script - FlontFipper.compiled.js - which I can then send to my prospective customer, the end user.

My end user then installs FlontFipper.compiled.js just like any normal script, and uses it just like he would use the non-encrypted original. Note that this is also supported on InDesign CS and CS2.

The main difference with an uncompiled script is that each time InDesign is restarted, the first time FlontFipper is used, a 'beg' dialog will appear, which will inform the end-user of the time-limited nature of the FlontFipper demo. There are also two buttons on the dialog: Get License... and Import License File....

If the end-user clicks Get License... his system's web browser will be directed to display the embedded URL, and it will pass through all the data I would need to create a personalized license file for my end-user.

I then proceed through some monetary transaction to receive the payment from the end-user.

How this transaction is conducted is up to the script developer - for example, I could set up a fully automatic web-based pay-and-license system, or I could simply use a manual system based on e-mail.

APID does not dictate how this transaction should be implemented - this part of the system is left open, and for the script developer to fill in.

Once I receive payment from the end-user I will then use the data I received via the URL to generate a license file, which I then e-mail to the end-user.

The end-user uses the license file to convert the time-limited demo of my script into a fully functional version.

The next time the beg dialog appears, he clicks Import License File... and imports the license file I sent. From then on FlontFipper.compiled.js will not show any more beg dialogs, and also will not lapse.

If you want to try these things out: go to

www.rorohiko.com/flontfipper.html

(all lowercase): you can see the transcript of this podcast with additional screenshots, and you can download the source code, the compiled script, and a license file to test things out.

I'll now dive a bit more into the details of the compilation. After I drag-drop my script onto the script compiler, I get to see a main dialog named APID Standalone Script Compiler.

Here I must define how the compiled script will behave. The most important field is a 'Component ID' string. It has quite a few subfields separated by semicolons and commas and is most easily edited by clicking the 'Edit...' button next to it.

Clicking the Edit... button shows the 'Component ID' dialog. In the 'Component ID' dialog, I need to define a number of parameters:

- a 'component name' which I set to be FlontFipper - this is the name that will appear in various dialogs
- a 'compilation password' which will be used in the encryption of the script - in this sample, I made the password guessWhat
- a copyright string

These first three strings are mainly about identifying your compiled script.

In this same dialog there are also a number of fields to restrict the use of your compiled script - there is a minimum APID version (which I set to 1.0.44 or 1.044 - the current version at the time of this podcast - the InDesignScriptCompiler will drop the leading zeroes automatically and display 1.44).

There is also a hard cut-off date for demo versions. Because I don't want to use such a date in this demo, I've set the three date-subfields (year, month, day) to zero.

The most interesting fields for me are the number of 'actual use' days and number of demo days.

#Demo days represents a number of calendar days since the first use of your script by the end user; 30 days here would mean 'about a month'.

#Actual Use Days represents days of actual use; these days are not necessarily consecutive. This is provided to cater for situations with 'infrequent use' - people that only test the script every so often, and might leave long gaps between uses. For example, you could leave #Demo days set to -1 and set #Actual Use Days to 5, and the user would be able to try things out for 5 non-consecutive days.

You can use either, or both fields to limit how long a demo version of your script should be usable.

In this sample, I've set the actual use days to 20, and the calendar days to 30 - the demo will time out after 20 days of actual use, or 30 calendar days, whichever comes first.
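The 'whichever comes first' rule can be written down as a small sketch. To be clear, this is not APID's code and says nothing about how APIDKernel works internally: the function and property names are invented, and I am assuming that -1 means 'no limit' for a field and that both counters include the current day.

```javascript
// Hypothetical sketch of the two demo limits.
//   limits.demoDays       : calendar days allowed (-1 assumed to mean no limit)
//   limits.actualUseDays  : days of actual use allowed (-1 assumed to mean no limit)
//   usage.calendarDays    : calendar days since first use, counting today
//   usage.useDays         : distinct days the script was used, counting today
function isDemoExpired(limits, usage) {
    var calendarExpired =
        limits.demoDays >= 0 && usage.calendarDays > limits.demoDays;
    var actualUseExpired =
        limits.actualUseDays >= 0 && usage.useDays > limits.actualUseDays;
    // Whichever limit is hit first ends the demo
    return calendarExpired || actualUseExpired;
}

var sampleLimits = { demoDays: 30, actualUseDays: 20 };
var stillFine = isDemoExpired(sampleLimits, { calendarDays: 10, useDays: 5 });
// stillFine is false: neither limit has been reached
```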

There is also a checkbox Free which I'd only use if I don't want to sell my script - for example if I just want to give encrypted versions away without divulging my source code. This checkbox is deselected in my example - I want to sell my script.

On the main screen, I must also enter some messages and a License URL.

The URL is set to link to a web page on my web server, and it passes 4 parameters as part of the URL: the serial number of InDesign, the name of the script (as defined in the Component ID dialog), the system identifier (a unique identifier for the computer requesting a license), and a license level (which will be a letter 'D' or 'R' depending on whether the end-user already has purchased a proper license for APIDKernel or not) - the strings ^1, ^2, ^3, and ^4 are placeholders that are replaced by real data when the URL is needed.

There are also two messages that can appear in the beg dialog. In these messages, ^1 is a placeholder for the compiled script name (as defined in the component ID earlier on), and ^2 is a placeholder for the remaining number of demo days.
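To show what the placeholder mechanism amounts to, here is a minimal sketch of this kind of substitution. It is an assumption-laden illustration, not APID's implementation: the function fillPlaceholders, the example URL, its parameter names and the message text are all made up, and whether the real values get URL-encoded is something I am only guessing at.

```javascript
// Hypothetical sketch: replace ^1, ^2, ... with values[0], values[1], ...
// 'encode' is optional; pass encodeURIComponent when filling in a URL.
function fillPlaceholders(template, values, encode) {
    return template.replace(/\^([1-9])/g, function (match, digit) {
        var value = values[parseInt(digit, 10) - 1];
        if (value === undefined) {
            return match; // no data for this placeholder: leave it as-is
        }
        value = String(value);
        return encode ? encode(value) : value;
    });
}

// Made-up license URL with the four placeholders described above
var licenseURL = fillPlaceholders(
    "http://www.example.com/license?serial=^1&name=^2&system=^3&level=^4",
    ["1234", "FlontFipper", "ABC 1", "D"],
    encodeURIComponent
);

// Made-up beg dialog message: ^1 = script name, ^2 = remaining demo days
var begMessage = fillPlaceholders("^1 demo: ^2 days left", ["FlontFipper", 12]);
// begMessage is "FlontFipper demo: 12 days left"
```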

Once I click Compile for APIDKernel two things will happen: a compiled version of my script will be created (FlontFipper.compiled.js), and my original source code will also be prefixed with a few comment lines that store my compilation parameters - this to avoid having to re-enter the same stuff next time around.

When a user installs FlontFipper.compiled.js in one of the proper script locations for Adobe InDesign, the script will behave mostly like any other script, except for the beg dialog. Clicking Get License... on the beg dialog makes the end-user's system browser connect to the URL embedded into the compiled script.

Once I receive payment from the end-user, I use a command-line tool that is part of the APIDToolkit to generate a license file for him. Mac and Windows versions of this tool are provided. If so desired, this command-line tool can be embedded into a web server setup - this is how Rorohiko does automated software sales, for example of our Sudoku Generator.

Note that the command line tool is not provided with the demo version of APIDToolkit - you need to purchase the APIDToolkit to get access to this generator - a license for the APIDToolkit costs US$149.00.

For simple set-ups you can also create 'generic' license files which are not linked to any particular system ID or InDesign serial number. The demo license file for this podcast is such a generic license file. It will enable FlontFipper.compiled.js on any copy of InDesign.

Costwise, the APIDKernel runtime for your end-user is not free - there is a one-time cost of US$25 or less per seat.

Note: there are multi-seat APIDKernel bundles available. These should only be used for a single end-user company. For example, you're not meant to purchase a 100-seat APIDKernel and then break these up and sell individual seats to different end-user companies.

Another point worth noting is that APIDKernel licenses are linked to a particular serial number of InDesign, as well as to a unique system identifier. That means that if your customer is using both InDesign CS2 and CS3 on the same computer, he'll need to purchase two licenses for APIDKernel.

Depending on how you set things up, there can be two payments to be made by a first-time end-user. First of all, the end-user needs a license for the APIDKernel, and second, he needs a license for your script.

It is up to you, as a script developer, to decide how to handle the cost for APIDKernel.

You can bundle a pre-purchased coupon code for a license for the kernel when you sell your script, or you can ask your customer to license the kernel directly from Rorohiko. That means you can choose to either have a single monetary transaction between the end-user and you, or there can be two separate transactions:
- between the end-user and you for your script
- between the end-user and Rorohiko for APIDKernel

Since version 1.0.44, there is also an option to provide your end-user with a single, combined license file which contains both a license for APIDKernel and a license for your script - contact APIDlicenses@rorohiko.com for more info; for this system to be available to you, you need to register with Rorohiko as an APID developer.

Lastly, if you expect to sell more than a few thousand copies of your script, you should probably consider our APIE/APIR combo - these are two other family members of the Active Page Items family. APIR is a runtime which is very similar to APIDKernel, except that it is free for end-users - there is no cost to the end-user for the APIR runtime.

To generate solutions that can use the free APIR instead of APIDKernel, you'd need to license APIE - which is a more expensive version of APIDToolkit, and can create solutions that work with the free APIR instead of the non-free APIDKernel.