Desktop Chinese-English lookup dictionary

I wrote this Java-based Chinese-English lookup dictionary (CED) over the weekend. It started off as a project to learn and improve my poor Chinese language skills but the more I got into it, the more I realised that I needed a tool to help me along while I try to improve my vocabulary.

So I started googling around and found this amazing dictionary file called CEDict, which gave me a list of all traditional (Big5) and simplified (GB) Chinese characters, their Hanyu Pinyin pronunciation as well as a simple description of the word or phrase. This was heaven-sent of course. This cool site, Mandarin Tools, also provided a great online dictionary tool as well as an offline Java desktop application (DimSum), which I used briefly in learning more Chinese words. However, after a while the app got a bit draggy as it was pretty bulky. Don’t get me wrong: DimSum is an incredibly cool tool with tons of features, but the only two things I really needed were a dictionary translation of the word or phrase and a hint of the pronunciation. The other stuff was just dead weight to me as I needed something fast and to the point.

So I wrote my own tool. I downloaded CEDict from Mandarin Tools and the Mandarin sounds from Chinese Lessons, and wrote a simple Java desktop application that uses this data. The result is CED (Chinese-English Dictionary, not too creative I admit).

The premise is simple. You run CED. While you are reading a Chinese document or website you find words that you don’t recognize. You’ll likely not want to drag out your thousand-page Chinese-English dictionary and start comparing brushstrokes, or go rushing to Mandarin Tools, cut and paste and wait for the answer. You just want to know what those words mean so that you can get on with your reading.

So you select the words and copy them (or press Ctrl-C). You switch over to CED and you will have the word you have just copied described to you, with a brief explanation of what it means. You will also find a list of words that are related to the one you have selected.
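The lookup itself boils down to parsing each dictionary line and matching it against the copied text. Here is a minimal sketch of the parsing half, assuming the common CEDict line layout of `traditional simplified [pinyin] /definition/definition/`. The class and method names are made up for illustration and are not taken from CED's source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parses one CEDict-style line. Not CED's actual code.
final class CedictLine {
    // Assumed layout: traditional simplified [pinyin] /def 1/def 2/
    private static final Pattern LINE =
        Pattern.compile("^(\\S+)\\s+(\\S+)\\s+\\[([^\\]]*)\\]\\s+/(.*)/\\s*$");

    final String traditional;
    final String simplified;
    final String pinyin;
    final List<String> definitions;

    private CedictLine(String traditional, String simplified,
                       String pinyin, List<String> definitions) {
        this.traditional = traditional;
        this.simplified = simplified;
        this.pinyin = pinyin;
        this.definitions = definitions;
    }

    // Returns an empty Optional for comment lines and lines that do not match.
    static Optional<CedictLine> parse(String line) {
        if (line.startsWith("#")) {
            return Optional.empty();
        }
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new CedictLine(m.group(1), m.group(2), m.group(3),
                Arrays.asList(m.group(4).split("/"))));
    }

    public static void main(String[] args) {
        // The escapes spell the traditional and simplified forms of "China".
        String sample = "\u4e2d\u570b \u4e2d\u56fd [Zhong1 guo2] /China/Middle Kingdom/";
        CedictLine.parse(sample)
            .ifPresent(e -> System.out.println(e.pinyin + " -> " + e.definitions));
    }
}
```

Loading every parsed line into a map keyed by both the traditional and the simplified form would then make the lookup of a copied word a single map access.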

There is even a ‘say’ button at the bottom of the list, which you can click to hear the Mandarin Hanyu Pinyin pronunciation of the word or phrase you have selected. Simple!

I’m releasing this under GPL and have registered a site at Java.net. In the meantime you can also find the installer binaries for Windows here. It should work with OSX or Linux or any platform that runs Java but I don’t have any of those so I can’t say for sure. Once the Java.net site is up I’ll upload the source files and jar files so that you can try it out yourself.

Have fun!

*NOTE*

The java.net site is up at http://ced.dev.java.net. You can download the Windows installer as well as a zipped file of the jars from this site. Drop me a note to tell me how you like it!

Session expiry in Rails

Something that stumped me for quite a few days was the fact that I couldn’t log into my application after some time. It was truly irritating because after migrating to Mongrel and having 3 Mongrel clusters running behind Apache it was blazing fast, but I couldn’t log in after some time!

Checking the error logs, I found what seemed to be the problem:

Filter chain halted as [login_required] returned false

What’s this then? After more research I suspected that the reason why my login throws me out is because the session is no longer valid. Following this clue, I installed LiveHTTPHeaders, a Firefox plug-in that snoops the HTTP headers that are transferred to and from the browser. This gave me more information but raised some very puzzling questions:

HTTP/1.x 200 OK
Date: Fri, 15 Sep 2006 02:11:59 GMT
Status: 200 OK
Cache-Control: no-cache
Content-Type: text/html; charset=UTF-8
Set-Cookie: sam_production_session_id=bdbae24176c4bfbec1be1109c2beee8c; path=/; expires=Thu, 14 Sep 2006 17:25:22 GMT
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 667
Connection: close

Apparently the session cookie that was set was expired as I was trying to log in! Of course the session was invalid and I was logged out! Curiouser and curiouser.
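To see just how stale that cookie was, the two timestamps in the headers can be compared directly. This is a small sketch using the standard `java.time` API; the class and method names are only illustrative.

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper: compares the Date header with the cookie's expires attribute.
final class CookieExpiryCheck {
    // Positive result: the cookie had already expired when the response was sent.
    static Duration expiredBy(String dateHeader, String cookieExpires) {
        DateTimeFormatter fmt = DateTimeFormatter.RFC_1123_DATE_TIME;
        ZonedDateTime sent = ZonedDateTime.parse(dateHeader, fmt);
        ZonedDateTime expires = ZonedDateTime.parse(cookieExpires, fmt);
        return Duration.between(expires, sent);
    }

    public static void main(String[] args) {
        Duration d = expiredBy("Fri, 15 Sep 2006 02:11:59 GMT",
                               "Thu, 14 Sep 2006 17:25:22 GMT");
        // Prints: Cookie expired PT8H46M37S before the response was sent
        System.out.println("Cookie expired " + d + " before the response was sent");
    }
}
```

With the headers above, the cookie was almost nine hours past its expiry by the time the browser received it.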

What was happening? Running this on Webrick worked perfectly, so I was really stumped. What was wrong with Apache/Mongrel that this didn’t work? Then I remembered that I had set the expiry of the session to 1 hour in my application.rb:

class ApplicationController < ActionController::Base
  session :session_expires => 1.hour.from_now
end

I tried to login repeatedly and looked at the LiveHTTPHeaders again. Surprise! The cookie expiry date doesn’t change! So what’s wrong?

Well, after more research, it turned out that the main problem was that I was running Webrick in the development environment while my Apache/Mongrel setup was running in production. In development mode, ApplicationController is reloaded every time a request is made, so session is called for each request. In a production environment the class is loaded only once, so session is called only once and 1.hour.from_now is evaluated at that moment, which means the expiry is effectively fixed. The reason why I could log in initially was that during my setup I restarted the cluster repeatedly.
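The same trap exists in other languages. The sketch below is a Java analogy, not Rails code: a value computed once when a class is loaded never moves, while a value computed per call does. All names in it are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Java analogy of the pitfall: load-time evaluation versus per-request evaluation.
final class ExpiryPitfall {
    // Evaluated once, when the class is loaded (like production mode).
    static final Instant FIXED_EXPIRY = Instant.now().plus(Duration.ofHours(1));

    // Evaluated on every call (what a per-request expiry actually needs).
    static Instant freshExpiry(Instant now) {
        return now.plus(Duration.ofHours(1));
    }

    public static void main(String[] args) {
        // Pretend a request arrives two hours after the class was loaded.
        Instant later = Instant.now().plus(Duration.ofHours(2));
        System.out.println("fixed expiry already in the past: "
                + FIXED_EXPIRY.isBefore(later));
        System.out.println("fresh expiry still in the future: "
                + freshExpiry(later).isAfter(later));
    }
}
```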

My solution? I removed session expiry, such that the session doesn’t expire. This means that as long as I don’t log out or close the browser, the session remains valid. So what are the other options? Unfortunately as I found out, Rails doesn’t really allow any easy means of doing dynamic session expiry. This entry in the rubyonrails wiki provides good information on the possible alternatives. Another alternative is suggested by Coda Hale in his blog.

If you have anything drop me a note as well.

Getting information from an EMV chip card with Java

In case you’re not sure what I’m talking about here, it’s the little chip embedded in your plastic credit, debit or ATM card. In the UK this is generally known as Chip and PIN. If you’re from the US or another country where chip migration hasn’t occurred yet, it might be a bit harder to imagine what it is, so here’s a picture.

Almost all payment chip cards (except in France) use a worldwide standard called EMV (a rather meaningless acronym from Europay-Mastercard-Visa, the founding members of the collaboration that created this standard; ironically, this US-created standard is used mostly outside the US).

Retrieving information from an EMV compliant chip is not an inherently difficult task. I personally believe it’s daunting to many programmers because the mechanism of talking to a smart card is something quite different from the higher-level programming we’re used to, which is one of the reasons why Thomas and I created Jaccal.

In this example I’ll break down a sample Jaccal script line by line in its raw APDU format to show exactly how information is retrieved from the chip card. I’ll show the script twice: first in pure APDU, and then as a higher-level Jaccal script that uses Jaccal APIs to do the job for you. I’ll also be showing you snippets from the official EMV4.1 specification, which I’ll mention in passing but not in detail. You can get the EMV specifications from EMVCo directly; they are freely downloadable.

What do you need? Firstly you’ll need an EMV chip card. Most probably this will be a credit card or a debit card with a chip on it. Then you’ll need a smart card reader. The one I’m using is a GemPC Twin from Gemplus (now Gemalto), but almost any card reader that supports PCSC will do. Card readers are mostly plug-and-play though if it needs drivers, it should come with the package. Windows supports smart cards by default so you shouldn’t need anything special. You’ll also need Jaccal of course. Download it from Sourceforge.

That’s all! Maybe a little bit of patience as well. I’m assuming that you’ll have a bit of knowledge of smart cards and some ISO 7816 knowledge, but don’t worry if you don’t, just drop me a note in the comments and I’ll add it in.

Connect to the chip card

To start any chip card interaction, you must first start a connection to the card. The card then responds with an ATR (Answer To Reset). ATRs can be used to determine the card technology used and the manufacturer that produced it.

atr = open();
prints(atr);

In Jaccal script, the command ‘prints()’ displays a string in the output. If you’re using the Anubis Script Editor, packaged together with Jaccal, this will be displayed in a separate tab. If you’re using Jaccal from the command line, the output is the console.

Output

Power on
[ATR] 3B 66 00 FF 4A 43 4F 50 32 30

A quick check with the smart card list maintained by Ludovic Rousseau shows that the card I used (UOB Platinum Visa Card) is likely to be an IBM JCOP (JavaCard Open Platform) 30 chip card.

Select the PSE directory

This example uses the PSE (Payment System Environment) selection method to query the chip card and determine which applications are on the EMV card. Not all EMV cards support this application selection method since it is optional in the EMV standard. MChip applications (from Mastercard) are required to support it while VIS (from Visa) leaves it optional. This means that if you try this with a Mastercard it will always work, but it’s a gamble with a Visa card. The PSE begins with a DDF given the name ‘1PAY.SYS.DDF01’.

prints("[Step 1] Select 1PAY.SYS.DDF01 to get the PSE directory");
cmd = new ApduCmd("00A404000E315041592E5359532E4444463031");
card_response = execute(cmd);
prints(card_response);

We start by creating an ApduCmd object initialized with this strange alphanumeric string. Looking at it carefully, we can split the string into two parts: the first is the command and the second is the data. The first part is an ISO 7816 SELECT command header followed by the data length (00 A4 04 00 0E), while the rest is the hexadecimal representation of the ASCII characters “1PAY.SYS.DDF01”.
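If you'd rather not hand-encode that string, it can be assembled from the name. This is a plain-Java sketch that only produces the hex string; the helper name `selectByName` is made up and is not part of Jaccal.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper: builds the hex string of a SELECT-by-name command.
final class SelectApdu {
    static String selectByName(String dfName) {
        byte[] name = dfName.getBytes(StandardCharsets.US_ASCII);
        StringBuilder apdu = new StringBuilder("00A40400");   // CLA INS P1 P2
        apdu.append(String.format("%02X", name.length));      // Lc: length of the name
        for (byte b : name) {
            apdu.append(String.format("%02X", b & 0xFF));     // name bytes as hex
        }
        return apdu.toString();
    }

    public static void main(String[] args) {
        // Prints: 00A404000E315041592E5359532E4444463031
        System.out.println(selectByName("1PAY.SYS.DDF01"));
    }
}
```

The resulting string is the same one passed to ApduCmd in the script above.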

Raw output

The result shows a successful selection of the PSE, which means that the PSE exists on the chip card. For the uninitiated, the status word (SW) returned (90 00) indicates success. The output is encoded in a simple TLV (tag-length-value) format.

[Step 1] Select 1PAY.SYS.DDF01 to get the PSE directory
[R] 6F 1A 84 0E 31 50 41 59 2E 53 59 53 2E 44 44 46 30 31 A5 08 88 01 01 5F 2D 02 65 6E
[SW] 90 00

To interpret the response output at all, you need to look at the EMV specifications, Book 1 section 11.3.4 on the structure of the response message upon selecting the PSE. This is a snippet of a table from the specification.

From the above you can tell that the DF name starts at the 5th byte of the response (31) and is 0E (14) bytes long, which is (31 50 41 59 2E 53 59 53 2E 44 44 46 30 31) or, translated to ASCII, 1PAY.SYS.DDF01. From the above you can also tell that the SFI (short file identifier) of the first PSE record is 01 and the supported language is ‘en’ (English). This is the interpreted output:

DF Name : 1PAY.SYS.DDF01
SFI : 1
Languages supported : en
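The manual reading of the DF name can be reproduced in a few lines. This sketch assumes the exact layout of the response shown above (template tag 6F, then tag 84 straight after it) and is not a general TLV parser; the names are illustrative.

```java
import java.nio.charset.StandardCharsets;

// Illustrative: pulls the DF name out of the SELECT response by position.
final class DfNameDecoder {
    // Converts space-separated hex bytes ("6F 1A ...") to a byte array.
    static byte[] hexToBytes(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    static String dfName(byte[] fci) {
        // fci[0] = 6F (template), fci[1] = template length,
        // fci[2] = 84 (DF name tag), fci[3] = DF name length
        if ((fci[0] & 0xFF) != 0x6F || (fci[2] & 0xFF) != 0x84) {
            throw new IllegalArgumentException("unexpected response layout");
        }
        int length = fci[3] & 0xFF;
        return new String(fci, 4, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] response = hexToBytes("6F 1A 84 0E 31 50 41 59 2E 53 59 53 2E 44 44 46 "
                + "30 31 A5 08 88 01 01 5F 2D 02 65 6E");
        System.out.println("DF Name : " + dfName(response));
    }
}
```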

Get the PSE record

Next we need to find out where to start getting the PSE data from.

SFI = NumUtil.hex2String((byte)((1 << 3) | 4));

From the READ RECORD command reference control parameter specification below (from the EMV4.1 specification, book 1 section 11.2.2 table 39), we know that the lowest 3 bits of P2 are 100 when P1 is a record number, and the highest 5 bits are the SFI. This means P2 is 00001100, and the code above builds it by shifting the SFI left by 3 positions and ORing it with 4 (binary 100). 1100 is 0C in hexadecimal.
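The same bit arithmetic works for any SFI. Below is a small plain-Java sketch with invented helper names; the 1 to 30 range check reflects the usual limits for a 5-bit SFI and is stated here as an assumption.

```java
// Illustrative: builds the P2 byte and a READ RECORD command string for an SFI.
final class ReadRecordP2 {
    static int p2ForSfi(int sfi) {
        if (sfi < 1 || sfi > 30) {               // assumed valid SFI range
            throw new IllegalArgumentException("SFI out of range: " + sfi);
        }
        return (sfi << 3) | 0b100;               // SFI in the top 5 bits, 100 below
    }

    // CLA=00, INS=B2, P1=record number, P2=reference control, then Le.
    static String readRecord(int sfi, int record, int le) {
        return String.format("00B2%02X%02X%02X", record, p2ForSfi(sfi), le);
    }

    public static void main(String[] args) {
        System.out.println(String.format("%02X", p2ForSfi(1)));  // 0C
        System.out.println(readRecord(1, 1, 0));                 // 00B2010C00
    }
}
```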

prints("[Step 2] Send READ RECORD with 0 to find out where the record is");
read = new ApduCmd("00B2010C00");
card_response = execute(read);
prints(card_response);
byte_size = NumUtil.hex2String(card_response.getStatusWord().getSw2());

Now that we know where the record is, we need to read the PSE record. Unfortunately reading a record from a smart card is not as direct as reading from a file system. The READ RECORD command needs to know how many bytes to read, but we don’t know that at this point. So we just send a length of 0 to the record location. The chip card will reply that 0 is not the correct number of bytes and give us the number of bytes to read! (It’s true, I’m not making this up.)

Raw output

[Step 2] Send READ RECORD with 0 to find out where the record is
[SW] 6C 1C

This output shows the first byte (status word 1) to be “6C”, which is the chip card’s code for “Wrong length”, while the second byte (status word 2) is “1C”, which is the size of the record.
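The "ask with 0, then ask again with the real length" dance can be written as a tiny helper. This sketch assumes the command is a hex string whose last byte is the Le field, as in the READ RECORD commands here; the names are hypothetical.

```java
// Illustrative: reacts to a 6C xx status word by patching Le into the command.
final class WrongLengthRetry {
    // Returns the length the card asked for, or -1 if SW1 is not 6C.
    static int correctedLength(int sw1, int sw2) {
        return sw1 == 0x6C ? sw2 : -1;
    }

    // Replaces the trailing Le byte of the command with the corrected length.
    static String retry(String apduHex, int sw1, int sw2) {
        int le = correctedLength(sw1, sw2);
        if (le < 0) {
            return apduHex;                       // nothing to correct
        }
        String withoutLe = apduHex.substring(0, apduHex.length() - 2);
        return withoutLe + String.format("%02X", le);
    }

    public static void main(String[] args) {
        // The card answered 6C 1C, so the retry asks for 1C (28) bytes.
        System.out.println(retry("00B2010C00", 0x6C, 0x1C));  // 00B2010C1C
    }
}
```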

prints("[Step 3] Send READ RECORD with 1C to get the PSE data");
read = new ApduCmd("00B2010C1C");
card_response = execute(read);
prints(card_response);

Now that we know how many bytes to get, we can then confidently send READ RECORD to the record location to get 1C bytes.

Raw output

[Step 3] Send READ RECORD with 1C to get the PSE data
[R] 70 1A 61 18 4F 07 A0 00 00 00 03 10 10 50 0A 56 49 53 41 43 52 45 44 49 54 87 01 01
[SW] 90 00

This time the chip card returns us the real PSE data, which we need to interpret again. Looking at the table below we can see that there is only 1 application data file (ADF).

The application name or application ID (AID) is the one with the tag 4F, with 7 bytes i.e. A0 00 00 00 03 10 10. The label for this application is the one that starts with tag 50, with 10 (hexadecimal 0A) bytes i.e. 56 49 53 41 43 52 45 44 49 54, and this translates to VISACREDIT. Lastly the priority for this application is 1, which is kind of redundant since it’s the only application that’s available. This is the interpreted output:

Application name (AID): A0 00 00 00 03 10 10
Application label: VISACREDIT
Application priority: 1
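Reading nested TLV by eye gets old quickly. Here is a minimal recursive BER-TLV walker in plain Java that flattens the record into tag and value pairs. It is a sketch rather than a full parser: it ignores padding bytes, and when a record lists several applications, later values overwrite earlier ones. It is also not Jaccal's TLV class.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative BER-TLV walker; flattens primitive objects into tag -> value (hex).
final class SimpleTlv {
    static byte[] hex(String s) {
        String clean = s.replaceAll("\\s+", "");
        byte[] out = new byte[clean.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(clean.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static Map<String, String> parse(byte[] data) {
        Map<String, String> out = new LinkedHashMap<>();
        walk(data, 0, data.length, out);
        return out;
    }

    private static void walk(byte[] d, int pos, int end, Map<String, String> out) {
        while (pos < end) {
            int first = d[pos++] & 0xFF;
            StringBuilder tag = new StringBuilder(String.format("%02X", first));
            if ((first & 0x1F) == 0x1F) {            // tag continues in next byte(s)
                int next;
                do {
                    next = d[pos++] & 0xFF;
                    tag.append(String.format("%02X", next));
                } while ((next & 0x80) != 0);
            }
            int len = d[pos++] & 0xFF;
            if (len > 0x7F) {                        // long form: 81 nn or 82 nn nn
                int count = len & 0x7F;
                len = 0;
                for (int i = 0; i < count; i++) {
                    len = (len << 8) | (d[pos++] & 0xFF);
                }
            }
            if ((first & 0x20) != 0) {               // constructed: descend into it
                walk(d, pos, pos + len, out);
            } else {                                 // primitive: record the value
                StringBuilder value = new StringBuilder();
                for (int i = pos; i < pos + len; i++) {
                    value.append(String.format("%02X", d[i] & 0xFF));
                }
                out.put(tag.toString(), value.toString());
            }
            pos += len;
        }
    }

    public static void main(String[] args) {
        Map<String, String> pse = parse(hex("70 1A 61 18 4F 07 A0 00 00 00 03 10 10 "
                + "50 0A 56 49 53 41 43 52 45 44 49 54 87 01 01"));
        System.out.println("AID      : " + pse.get("4F"));
        System.out.println("Label    : "
                + new String(hex(pse.get("50")), StandardCharsets.US_ASCII));
        System.out.println("Priority : " + pse.get("87"));
    }
}
```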

All this hard work only tells us what the EMV application is. We have not really come to getting the actual data that is on the card yet! Moving on, we need to select the application found from the PSE and try to get data from it.

Select the application

Now that we know where the application is, go ahead and select it. You should get a satisfactory status word of “90 00” along with response data that basically echoes what you have just selected.

prints("[Step 4] Now that we know the AID, select the application");
cmd = new ApduCmd("00A4040007A0000000031010");
card_response = execute(cmd);
prints(card_response);

Raw output

[R] 6F 25 84 07 A0 00 00 00 03 10 10 A5 1A 50 0A 56 49 53 41 43 52 45 44 49 54 87 01 01 5F 2D 08 65 6E 7A 68 6A 61 6B 6F
[SW] 90 00

The next step after selecting the application is to send a “GET PROCESSING OPTIONS” (GPO) command to retrieve the Application Interchange Profile (AIP) and the Application File Locator (AFL). To send a GPO you’ll need the Processing Options Data Object List (PDOL), which determines what goes into the data field of the GPO command. The PDOL is part of the response from the selection of the application as described below.

The PDOL tag is “9F38” and the PDOL is an optional element. From the response you’ll see that there is no PDOL from the ICC. This is quite common.

Get the Application File Locator (AFL)

Moving on, we will send the GPO to the chip card to get the AIP and AFL. If you already know where the data is, you don’t really need the GPO response, though.

prints("[Step 5] Send GET PROCESSING OPTIONS command");
cmd = new ApduCmd("80A80000028300");
card_response = execute(cmd);
prints(card_response);

The GPO command is “80 A8 00 00 02 83 00”. Since there is no PDOL, we will put the tag 83 with the size 00 only. Lc is the size of the data field, which is 2 bytes.
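Put together, the command is a fixed header, Lc, and a tag 83 wrapper around whatever data the PDOL asks for. This is a plain-Java sketch that produces the hex string; the helper name is invented, and it assumes the data stays short enough for single-byte lengths.

```java
// Illustrative: assembles a GET PROCESSING OPTIONS command as a hex string.
final class GpoCommand {
    // pdolDataHex is the concatenated PDOL values, or "" when the card has no PDOL.
    static String build(String pdolDataHex) {
        int dataLength = pdolDataHex.length() / 2;
        // Tag 83, its length, then the PDOL-related data.
        String field = "83" + String.format("%02X", dataLength) + pdolDataHex;
        // CLA=80, INS=A8, P1=00, P2=00, then Lc and the data field.
        return "80A80000" + String.format("%02X", field.length() / 2) + field;
    }

    public static void main(String[] args) {
        System.out.println(build(""));  // 80A80000028300
    }
}
```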

Raw Output

[Step 5] Send GET PROCESSING OPTIONS command
[R] 80 0E 7C 00 08 01 01 00 10 01 05 00 18 01 02 01
[SW] 90 00

The AIP consists of 2 bytes and indicates which features are supported by the chip card while the AFL indicates the location (SFI and range of records) of the files related to a given application. This is the juicy stuff, the data that you want out of the chip card. The AFL consists of groups of 4 bytes, each group indicating a range of records.

The AIP in this case is 7C 00 while the 3 groups of AFL are (08 01 01 00), (10 01 05 00) and (18 01 02 01).

These are the rules on how you can interpret a group of bytes in the AFL:

0 8 0 1 0 1 0 0
0000 1000 0000 0001 0000 0001 0000 0000

The five most significant bits of the first byte (08) indicate the SFI. The three least significant bits of the first byte are always set to zero. This means the SFI is 1.

The second byte (01) indicates the first (or only) record number to be read for that SFI. The record number is 1.

The third byte (01) indicates the last record number to be read for that SFI. Its value is either greater than or equal to the second byte. When the third byte is greater than the second byte, all the records ranging from the record number in the second byte to and including the record number in the third byte shall be read for that SFI. When the third byte is equal to the second byte, only the record number coded in the second byte shall be read for that SFI. Since the second and third bytes are the same, we will only read record number 1.

The fourth byte (00) indicates the number of records involved in offline data authentication starting with the record number coded in the second byte. The fourth byte may range from zero to the value of the third byte less the value of the second byte plus 1. There is no offline data authentication with the first group of 4 bytes.

1 0 0 1 0 5 0 0
0001 0000 0000 0001 0000 0101 0000 0000

SFI is 2, first record to read is 1, last record is 5 and there is no offline data authentication.

1 8 0 1 0 2 0 1
0001 1000 0000 0001 0000 0010 0000 0001

SFI is 3, first record to read is 1, last record is 2 and 1 record (starting from record 1) is involved in offline data authentication.
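These rules are mechanical enough to code up. Here is a plain-Java sketch that splits an AFL into its 4-byte groups; the class and field names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative: decodes the Application File Locator into its 4-byte groups.
final class AflParser {
    static final class Entry {
        final int sfi, firstRecord, lastRecord, odaRecords;

        Entry(int sfi, int firstRecord, int lastRecord, int odaRecords) {
            this.sfi = sfi;
            this.firstRecord = firstRecord;
            this.lastRecord = lastRecord;
            this.odaRecords = odaRecords;
        }

        @Override
        public String toString() {
            return String.format(
                "SFI %d, records %d-%d, %d used for offline data authentication",
                sfi, firstRecord, lastRecord, odaRecords);
        }
    }

    static List<Entry> parse(byte[] afl) {
        if (afl.length % 4 != 0) {
            throw new IllegalArgumentException("AFL must be a multiple of 4 bytes");
        }
        List<Entry> entries = new ArrayList<>();
        for (int i = 0; i < afl.length; i += 4) {
            entries.add(new Entry(
                (afl[i] & 0xFF) >> 3,      // SFI: top five bits of byte 1
                afl[i + 1] & 0xFF,         // first record number
                afl[i + 2] & 0xFF,         // last record number
                afl[i + 3] & 0xFF));       // records involved in offline auth
        }
        return entries;
    }

    public static void main(String[] args) {
        byte[] afl = {0x08, 0x01, 0x01, 0x00, 0x10, 0x01, 0x05, 0x00,
                      0x18, 0x01, 0x02, 0x01};
        for (Entry e : parse(afl)) {
            System.out.println(e);
        }
    }
}
```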

Get the record information!

Now that we know where the information is, let’s go get it. I will show you how to get the SFI 1 only, you can try the rest yourself.

prints("[Step 6] Send READ RECORD with 0 to find out where the record is");
read = new ApduCmd("00B2010C00");
card_response = execute(read);
prints(card_response);
byte_size = NumUtil.hex2String(card_response.getStatusWord().getSw2());
prints("[Step 7] Use READ RECORD with the given number of bytes to retrieve the data");
read = new ApduCmd("00B2010C" + byte_size);
card_response = execute(read);
prints(card_response);

data = new TLV(card_response.getData());

For SFI 1, you know the drill: P1 is the record number, which is 1, while P2 carries the SFI, which you derive by shifting the SFI left by 3 bits and adding 4. This becomes 0C in hexadecimal.

First send a 00 as the Le to get the number of bytes to retrieve. After you know the number of bytes (I’m skipping the step where you inspect the raw output, since this is the same as above), you can use that as the Le to retrieve the number of bytes from the chip card.

Raw Output

[Step 6] Send READ RECORD with 0 to find out where the record is
[S] 00 B2 01 0C 00
[SW] 6C 4F
[Step 7] Use READ RECORD with the given number of bytes to retrieve the data
[S] 00 B2 01 0C 4F
[R] 70 4D 57 13 XX XX XX XX XX XX XX XX D0 70 42 01 20 00 00 96 00 00 0F 5F 20 1A 43 48 41 4E 47 20 53 41 55 20 53 48 45 4F 4E 47 20 20 20 20 20 20 20 20 20 20 9F 1F 18 32 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 39 36 30 30 30 30 30 30
[SW] 90 00

The response of the last READ RECORD command returns something interesting. The reason why you see all the XX’s is that they stand in for my credit card number. In the last part of the code, you see a new class called TLV. This class parses the card response and transforms it into a proper TLV that you can query for the information. Alternatively you can try to interpret the data above using EMV4.1 Book 3 Annex A, where all the data elements used for EMV are described. You will see that the tag “57” is the Track 2 equivalent data. This means that the data here is an exact duplicate of the information in track 2 of the magnetic stripe of the same card. You can see that after all the XX’s there is a “D”. This separates the PAN (primary account number, or the credit card number) from the expiry date of the card, which is in YYMM format. Here it shows that the card will expire in 0704, which is April 2007. The same data is actually found in SFI 3, but I won’t go through that with you here.
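Splitting the Track 2 equivalent data is simple string work once it is viewed as hex digits. In the sketch below the PAN is a placeholder (4111 1111 1111 1111), not the masked number from the output above, and the class name is made up. The three digits after the expiry date are the service code in the standard track 2 layout.

```java
import java.util.Locale;

// Illustrative: splits Track 2 equivalent data (tag 57) into its leading fields.
final class Track2 {
    final String pan;
    final String expiryYymm;
    final String serviceCode;

    private Track2(String pan, String expiryYymm, String serviceCode) {
        this.pan = pan;
        this.expiryYymm = expiryYymm;
        this.serviceCode = serviceCode;
    }

    static Track2 parse(String hex) {
        String digits = hex.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
        int sep = digits.indexOf('D');               // field separator after the PAN
        if (sep < 0 || digits.length() < sep + 8) {
            throw new IllegalArgumentException("not Track 2 equivalent data");
        }
        return new Track2(
            digits.substring(0, sep),                // PAN
            digits.substring(sep + 1, sep + 5),      // expiry date, YYMM
            digits.substring(sep + 5, sep + 8));     // service code
    }

    public static void main(String[] args) {
        // Placeholder PAN followed by the unmasked tail of the record above.
        Track2 t = parse("41 11 11 11 11 11 11 11 D0 70 42 01 20 00 00 96 00 00 0F");
        System.out.println("Expiry (YYMM): " + t.expiryYymm);   // 0704
        System.out.println("Service code : " + t.serviceCode);  // 201
    }
}
```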

Looking further, you can see the tag “5F20” which is the tag for the Cardholder Name. The subsequent bytes are the hexadecimal representation of my name — “CHANG SAU SHEONG”. Note that there are spaces (20) after my name and that the size of the data field is 26 bytes. You guessed it — the cardholder name can have a maximum of only 26 characters.

Finally, after you have gotten what you wanted, you need to close the connection to the card reader nicely:

close();

The code here is pretty tedious with all the APDUs in byte format. What Jaccal has done is to put things nicely in Java classes and methods. An equivalent of those APDU commands can be something like this:

atr = open();
prints(atr);

prints("[Step 1] Select 1PAY.SYS.DDF01 to get the PSE directory");
cmd = new ISOSelect(ISOSelect.SELECT_AID, EMV4_1.AID_1PAY_SYS_DDF01);
card_response = execute(cmd);
prints(card_response);
SFI = NumUtil.hex2String((byte)((1 << 3) | 4));

// try SFI 1 record 1
prints("[Step 2] Send READ RECORD with 0 to find out where the record is");
read = new EMVReadRecord(SFI, "01", "00");
card_response = execute(read);
prints(card_response);
byte_size = NumUtil.hex2String(card_response.getStatusWord().getSw2());

prints("[Step 3] Send READ RECORD with 1C to get the PSE data");
read = new EMVReadRecord(SFI, "01", byte_size);
card_response = execute(read);
prints(card_response);
// the AID is A0000000031010
prints("[Step 4] Now that we know the AID, select the application");

cmd = new ISOSelect(ISOSelect.SELECT_AID, "A0000000031010");
card_response = execute(cmd);
prints(card_response);
prints("[Step 5] Send GET PROCESSING OPTIONS command");

cmd = new EMVGetProcessingOptions();
card_response = execute(cmd);
prints(card_response);

// SFI for the first group of AFL is 0C

prints("[Step 6] Send READ RECORD with 0 to find out where the record is");
read = new EMVReadRecord("0C", "01", "00");
card_response = execute(read);
prints(card_response);
byte_size = NumUtil.hex2String(card_response.getStatusWord().getSw2());

prints("[Step 7] Use READ RECORD with the given number of bytes to retrieve the data");
read = new EMVReadRecord("0C", "01", byte_size);
card_response = execute(read);
prints(card_response);

data = new TLV(card_response.getData());

close();

What you should take note of is that some of the interpretation is done manually to keep the script simple. In a real-life situation you’ll probably want to automate things much more, perhaps down to a single method of an EMV class that goes like “getCardholderName()” and hands you the name. Jaccal is open source, so you’re welcome to create something like that.
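As a rough idea of what such a method could look like, here is a standalone Java sketch that works on the raw record bytes rather than on Jaccal classes. Every name in it is hypothetical, and the tag search handles only one-byte and two-byte tags.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a getCardholderName() helper; not part of Jaccal.
final class CardholderName {
    // Depth-first search for a tag in BER-TLV data; returns its value or null.
    static byte[] find(byte[] d, int from, int to, int wanted) {
        int pos = from;
        while (pos < to) {
            int first = d[pos++] & 0xFF;
            int tag = first;
            if ((first & 0x1F) == 0x1F) {            // two-byte tag such as 5F20
                tag = (tag << 8) | (d[pos++] & 0xFF);
            }
            int len = d[pos++] & 0xFF;
            if (len == 0x81) {                       // one extra length byte
                len = d[pos++] & 0xFF;
            }
            if (tag == wanted) {
                return Arrays.copyOfRange(d, pos, pos + len);
            }
            if ((first & 0x20) != 0) {               // constructed: look inside
                byte[] hit = find(d, pos, pos + len, wanted);
                if (hit != null) {
                    return hit;
                }
            }
            pos += len;
        }
        return null;
    }

    static String getCardholderName(byte[] record) {
        byte[] raw = find(record, 0, record.length, 0x5F20);
        return raw == null ? null : new String(raw, StandardCharsets.US_ASCII).trim();
    }

    // Builds a record template (70) holding only a space-padded cardholder name.
    static byte[] sampleRecord(String name) {
        byte[] padded = String.format("%-26s", name).getBytes(StandardCharsets.US_ASCII);
        byte[] record = new byte[5 + padded.length];
        record[0] = 0x70;
        record[1] = (byte) (3 + padded.length);
        record[2] = 0x5F;
        record[3] = 0x20;
        record[4] = (byte) padded.length;
        System.arraycopy(padded, 0, record, 5, padded.length);
        return record;
    }

    public static void main(String[] args) {
        System.out.println(getCardholderName(sampleRecord("CHANG SAU SHEONG")));
    }
}
```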

Enjoy!

New release of Jaccal

Following a spurt of renewed interest in smart cards, I did a new release of Jaccal, mainly based on usage of Jaccal for retrieving data from an EMV chip card. This new version also included a trick I used in JSS in making built-in commands for Jaccal’s scripting engine, based on BeanShell scripts, which got rid of the ugly ‘anubis’ prefix for built-in commands.

I will be following this up with some other blog entries on EMV as well. Stay tuned.