Foreign Language Text Recognition for the Layman

May 1, 2009

It’s not often that I look at a foreign text and cannot determine what language it is.  We can all tell French, German, or Spanish, but what about the different Cyrillic languages, or Far Eastern ones like Thai and Vietnamese?  I recently came across something that might help us.

The US Army Field Manual 34-54 on Battlefield Technical Intelligence is freely available on the Web.  Here is a description of this manual as taken from the manual itself:

This field manual provides guidance to commanders and staffs of military intelligence (MI) and other units responsible for technical intelligence (TECHINT) or having an association with TECHINT. It provides general guidance and identifies the tactics, techniques, and procedures (TTP) used in the collection, exploitation, and dissemination of TECHINT in satisfying the warfighter’s requirements.

Appendix G, entitled “Foreign Language Text Recognition”, is a concise and educational lesson on how to recognize a foreign language in unknown text.

Quoting from Appendix G:

When TECHINT personnel are able to correctly identify foreign languages used in documents or equipment, it has two immediate benefits. First, it helps identify the equipment or type of document and where or who is using it. Second, it ensures that TECHINT personnel request the correct linguistic support.

This appendix contains language identification hints that will enable TECHINT personnel to quickly identify some of the many languages used in documents, on equipment plates, and on other materiel. TECHINT personnel can speed up the entire battlefield TECHINT process by following the guidance herein.

For those of us who are, um, a little rusty and have forgotten the difference between a cedilla and a circumflex, this appendix will set us right.  Gone are the excuses for not recognizing a foreign language when you see one. 🙂
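
For the programmers among us, the same idea (look at the script first, then at the diacritics) can be sketched in a few lines of Python. This is only my own illustration, not the method from Appendix G: the function names script_counts and diacritics are made up for this example, and taking the first word of a Unicode character name as the "script" is a rough assumption that works for common cases like Latin, Cyrillic, and Thai.

```python
import unicodedata
from collections import Counter


def script_counts(text):
    """Count letters by the first word of their Unicode character name.

    Assumption: the first word of the name (LATIN, CYRILLIC, THAI, ...)
    is a good-enough stand-in for the script of the character.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def diacritics(text):
    """Return the names of the combining marks found in the text.

    NFD normalization splits an accented letter into its base letter
    plus one or more combining marks, e.g. c + COMBINING CEDILLA.
    """
    marks = set()
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            marks.add(unicodedata.name(ch, "UNKNOWN"))
    return marks


if __name__ == "__main__":
    for sample in ["Привет", "garçon", "fête", "Việt"]:
        print(sample, dict(script_counts(sample)), sorted(diacritics(sample)))
```

The script narrows things down to a family of languages, and the set of diacritics helps separate languages that share a script. Telling Russian from Bulgarian, say, would still need the kind of letter-by-letter hints the appendix provides.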


Easy Indexed Access to Declassified NSA/CSS Documents

April 27, 2009

The NSA/CSS Declassification Initiatives web page contains the following easily overlooked paragraph:

“An index of 4,923 entries containing approximately 1.3 million pages of previously declassified documents, which have been released to NARA is provided. The documents are from the pre-World War I period through the end of World War II.”

The link refers to a fascinating listing of cryptologic documents declassified by NSA/CSS in Project OPENDOOR (1996) and released to the U.S. National Archives and Records Administration (NARA) in Washington, D.C. In the very long, unsorted list one could easily overlook such gems as:


A Superb Cryptanalytical Guide to the Enigma Cipher Machine

April 24, 2009

Over the years I’ve come across many articles explaining how the Enigma cipher machine works. All too often I would feel, as a cryptanalyst, that many of the articles glossed over important features or handled them poorly.

Well, if you’re looking for an in-depth explanation of the Enigma, complete with a lucid mechanical description and mathematical underpinnings you can really sink your teeth into (and understand!) look no further. Check out Erik Vestergaard’s superb explanation of the Enigma’s mechanical, operational, and mathematical aspects. A Danish high school mathematics teacher, Vestergaard took his class on a study tour to London in 2007, and one of their stops included Bletchley Park. This site is a wonderful compilation of their experience in Bletchley, complete with mouth-watering, clear descriptions of how the Enigma works and how it was broken. Here’s a list of topics covered: Read the rest of this entry »


NSA/CSS declassified documents

April 7, 2009

The National Security Agency (NSA) and the Central Security Service (CSS) periodically release declassified documents, or indexes to these documents, to the public.  This is all part of the Freedom of Information Act (FOIA), which allows for the full or partial disclosure of previously unreleased information and documents controlled by the United States Government.

If cryptology and cryptanalysis are your cup of tea, just browse over to NSA’s Declassification Initiatives web page and dive in.   You’ll need a few hours to do this justice, so plan on returning a few times.

Here are some juicy finds: Read the rest of this entry »


Trumpet Your Successes

March 24, 2009

Some while back I read an interesting article by Mary K. Pratt on the ComputerWorld Web site entitled “Five things you should always tell your boss”.  As a software manager myself for more than a decade, I agree with the list of items in the article.  Item number 5, entitled “Your Successes”, triggered a memory. Read the rest of this entry »


My “Best Technical Article Opening Paragraph” Award

March 16, 2009

We’ve all read great articles written by talented authors.  You may have even written some articles yourself.  We are well aware of the holy grail of article writing: capturing the reader’s interest in the opening paragraph.  It’s the difference between a forgettable article and one that will live forever in your readers’ minds. Read the rest of this entry »


Cryptographically-Challenged Cryptographers

March 5, 2009

In a recent posting on the Cipher Mysteries blog, Nick Pelling provides a list of contemporary cipher challenges he collected from the Web.  One of those listed is by one Andrew Fergus, who posted his own supposedly unbreakable cipher which apparently uses a pseudo-random number generator for encryption.  As Mr. Fergus proclaims, “I have spent some time developing a cipher, which I genuinely believe is unbreakable”.

What left me in a state of disbelief was the following gem found on Mr. Fergus’s site: Read the rest of this entry »