InnoData Leak Analysis: 6,648 Files from a Nasdaq-Listed BPO Company

A forensic analysis of a 1.1 GB data dump from InnoData Inc. (Nasdaq: INOD), a publicly traded business process outsourcing company headquartered in Hackensack, New Jersey. The dump consists of 6 RAR archives holding 6,648 loose files (documents, spreadsheets, PDFs, images, and text files) plus 363 nested archives. The material spans 2006–2012 and falls into two distinct categories: InnoData's own internal operational data from its patent abstracting business, and highly sensitive client source documents from Russian government agencies, Vietnamese nuclear authorities, and Turkmenistan's interior ministry.

Key finding: InnoData appears to have been processing classified or sensitive government documents from the Russian FSB (Federal Security Service), the Vietnamese Atomic Energy Agency (VAEA), and Turkmenistan's Ministry of Internal Affairs alongside its commercial patent abstracting work for Thomson Reuters / Clarivate Analytics. All of this material ended up in a single extractable archive dump.



1. Archive Overview & Structure

The dump consists of 6 RAR archives totaling 1.1 GB compressed:

Archive | Size | Files | Content
InnodataRawDOCfiles.rar | 177 MB | 2,834 | .doc, .docx, .dot, .rtf
InnodataRawXCELfiles.rar | 122 MB | 2,074 | .xls, .xlsx, .ppt, .rtf, .pps
InnodataRawPDFiles.rar | 273 MB | 770 | .pdf
InnodataRawTXTfiles.rar | 26 MB | 500 | .txt, .att, .html, .pptx, .dat, .ics, .mdb
InnodataRawIMGfiles.rar | 164 MB | 470 | .jpg, .tif, .png, .gif, .xml, .bmp
InnodataInternalArchives.rar | 369 MB | 363 | Nested .rar and .zip archives (12 password-protected)
Total | 1.1 GB | 7,011 | 6,648 loose files plus the 363 nested archives
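
The per-archive counts can be cross-checked against both totals quoted for the dump. The sketch below uses only the figures from the table; the variable names are illustrative.

```python
# File counts per archive, copied from the table above.
archive_files = {
    "InnodataRawDOCfiles.rar": 2834,
    "InnodataRawXCELfiles.rar": 2074,
    "InnodataRawPDFiles.rar": 770,
    "InnodataRawTXTfiles.rar": 500,
    "InnodataRawIMGfiles.rar": 470,
    "InnodataInternalArchives.rar": 363,  # nested .rar/.zip archives
}

# Every entry across the six archives, nested archives included.
total_entries = sum(archive_files.values())

# Leaving out the nested-archive container gives the headline file count.
loose_files = total_entries - archive_files["InnodataInternalArchives.rar"]

print(total_entries, loose_files)
```

The headline figure of 6,648 therefore counts the five type-specific archives only, while 7,011 also counts each nested archive as one entry.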

All archives use the naming convention RAWDATASTORAGE[ID]_IR_[TYPE], suggesting structured exports from an internal storage system. The storage IDs (1449856, 1451057, 44410, 671922, 76288) appear to be database record identifiers.
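
A naming convention like this is easy to test mechanically. The following is a minimal sketch, assuming the ID is purely numeric (as the quoted storage IDs are) and that TYPE is a short alphanumeric code; the `DOC` suffix in the example is hypothetical, since the TYPE values are not listed here.

```python
import re

# Pattern for the convention described above: RAWDATASTORAGE[ID]_IR_[TYPE].
# The character class used for TYPE is an assumption, not taken from the dump.
NAME_RE = re.compile(r"^RAWDATASTORAGE(?P<storage_id>\d+)_IR_(?P<type>[A-Za-z0-9]+)")


def parse_storage_name(name: str):
    """Return (storage_id, type_code), or None if the name does not follow the convention."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return int(match.group("storage_id")), match.group("type")


# The ID is one quoted in the text; the "DOC" suffix is a hypothetical TYPE value.
example = parse_storage_name("RAWDATASTORAGE1449856_IR_DOC")
print(example)
```

Names that return `None` would be the ones worth inspecting by hand.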

File Groups by Naming Prefix

Within the archives, files cluster into consistent naming groups that map to distinct data domains:

Prefix | Domain | Approx. Files
APL_VINATOM_SECTION9_RUIACRAW_VAEA | Vietnamese nuclear program | 1,722
SCAND_RUFSBIACRAW_ATTSAT | Russian fisheries / customs | 1,439
SCAND_IACRUFSB_RAWIRID | FSB maritime enforcement docs | 851
ISO_RAW_IACDERW_RPRT | InnoData internal operations | 433
IAC_VAEARAW_RU_ENLS | VAEA / IAEA PDFs | 396
SCAND_FSBRU_RAW_IAC_AMOS | Russia-Japan maritime protocols | 277
VOSTOK_FSB_SCAND | FSB Far East vessel reports & images | 195
SPB_HOCSAURUN_RUIAC_TMPDCR | Russian banking / finance | 166
MMIBS_VOCORDRU_* | VOCORD surveillance / IT infrastructure | 160+
INNOD_REPORTLOG / ISOGEN_DERWTEMP | InnoData QC reports & patent abstracts | 172
HOCSAURUN_IACTEMP | Russian personal/business scanned docs | 69
IAC_PHARMARAW_DERW_TEMP | Pharma patent training materials | 53
INDTAFEFSBLOGENTRY* | FSB intercepted satellite comms | 8
InmarsatINDTattlogprocd | Inmarsat vessel messages to FSB | 159

2. InnoData Internal Operations

The dump contains substantial internal operational data from InnoData's patent abstracting business, which services the Derwent World Patents Index (DWPI) for Thomson Reuters (now Clarivate Analytics). The work was performed at InnoData's IAC (InnoData Abstraction Center) facility, apparently based in the Philippines (Filipino employee names throughout).

Employee Commission & Payroll Data (~200+ spreadsheets)

Biweekly commission sheets titled "COMPUTATION OF COMMISSION FOR DDF ABSTRACTORS" reveal InnoData's compensation structure:

  • Per-record pricing: $35 per LONG document, lower rates for SHORT
  • Named employees with daily document counts, quality scores, and quality incentives
  • Example: editor Jeniffer Buendia processed 64 LONG documents = $2,240 commission per period

Quality Assurance Scorecards (~200+ spreadsheets)

Monthly QA tracking per employee with columns for: assessed, passed, score (% accuracy), penalty levels (P0, P1, P1a), and disciplinary remarks ("For Written Warning 1", "For Written Warning 2", "For Dismissal", "Currently below 90%").

Patent QC Error Reports (73 HTML files)

Weekly QC sampling reports generated for the vendor "INNOD" covering July 2008 through December 2009. Comprehensive statistics:

Metric | Value
Total patent reviews | 4,650
Unique patents | 3,333
Active InnoData editors | 131 (codes 7001–7352)
QC editors | 9 (TFULC, JARSC, SBAIL, etc.)
Average score | 95.1%
Perfect score rate | 77.3%
Technology areas | 14 (Chem, Pharma-Bio, Computing, Semiconductors, etc.)

Patent Country Distribution

Country | Count | Share
CN (China) | 2,157 | 46.4%
US (United States) | 642 | 13.8%
WO (WIPO/PCT) | 627 | 13.5%
JP (Japan) | 396 | 8.5%
KR (South Korea) | 319 | 6.9%
DE (Germany) | 189 | 4.1%
EP (European Patent) | 173 | 3.7%
Others | 147 | 3.2%
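
The shares in this table follow directly from the counts, and the counts add up to the 4,650 total reviews reported above. A short sketch of that check, using only the table's figures:

```python
# Review counts per patent authority, copied from the table above.
country_counts = {
    "CN": 2157, "US": 642, "WO": 627, "JP": 396,
    "KR": 319, "DE": 189, "EP": 173, "Others": 147,
}

total_reviews = sum(country_counts.values())

# Percentage share of each authority, rounded to one decimal as in the table.
shares = {code: round(100 * n / total_reviews, 1) for code, n in country_counts.items()}

for code, share in shares.items():
    print(f"{code:<7}{country_counts[code]:>6}{share:>7}%")
```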

Most Error-Prone Abstract Sections

Section | Fail Rate
Activity (AC) | 8.8%
Detailed Description (DD) | 7.1%
Tech Focus (TF) | 6.6%
Title (TI) | 5.1%
Novelty (NO) | 3.5%
Advantage (AD) | 3.3%
Description of Drawings (DoD) | 3.3%
Use (US) | 2.8%

The dominant error category was translation/language errors (60.7% of QC comments), particularly from Chinese patent translation. Example errors: "dispelling" instead of "treating", "selectomycin slag" as a nonsensical chemical translation.

Work Product & Training Materials

  • 99 completed patent abstracts in Derwent's proprietary tagged format (<TI1>, <NOV>, <USE>, <ADV>, <DDSC>)
  • Derwent Abstracting Rules, Edition 5.3 (136-page internal rulebook from Thomson Reuters)
  • Pharmaceutical terminology training slides, approved abbreviation lists, chemistry acronym references
  • Disease-activity keyword mapping tables (74 pages)
  • Software configuration files for the CPI data entry application, including URLs (online.cpi-outsourcing.com)
  • Data entry executable: DILDF02-DataEntry.exe with validation tools (DDF-VAL.EXE)

Pharmaceutical Patent Abstracting Pipeline (from nested archives)

97 ZIP files in the nested archives contain the actual production pipeline for Derwent pharmaceutical abstracts:

  • Structured abstracts of pharmaceutical journal articles with drug classification tags (/OC/PH = organic compound/pharmaceutical)
  • Biological activity data (cytotoxicity IC50 values, antimalarial activity)
  • Indexer workflow files (.HLD) with batch names, indexer IDs, subject areas
  • Quality monitoring spreadsheets: "2009 Derwent Quality Abstractor Monitoring File.xls", "2009 Derwent Quality Assessment Score.xls", "2009 Derwent Quality Error Analysis (Country - Technology).xls"

3. The FSB Database: 1,396 Maritime Enforcement Records

The most significant single file in the dump is WORKGROUPBASE_RUFSB.mdb, a 5.8 MB Microsoft Access database. It is a complete operational database of the Russian Federal Security Service (FSB) Border Guard Service, tracking maritime law enforcement incidents in Russia's Pacific maritime zones from 2006 to 2011.

Database Structure: 6 Tables

Table | Rows | Content
Main Vessels Table | 1,396 | Core incident records with 31 columns
Patrol Units | 98 | FSB patrol ships and inspection units
Countries | 217 | ISO country code lookup
Fishing Subzones | 15 | Maritime zones (Karaginskaya, Kuril, Okhotsk, etc.)
Border Directorates | 7 | Regional FSB commands
Service Units | 11 | Customs and inspection services

Main Table Fields

Each of the 1,396 incident records contains:

  • Date, time, and fishing subzone
  • Vessel type, name, flag state, home port
  • Ship owner name and address, captain name
  • Description of violation (free-text memo field)
  • Species and quantity of seized marine bioresources
  • Booleans: weapons used, vessel detained, vessel confiscated, criminal proceedings opened
  • Environmental law, border regime, and frontier regime violation flags
  • Primary measures taken, administrative penalties, criminal proceedings details

Aggregate Statistics

Metric | Count
Total incidents | 1,396
Vessels detained | 834 (59.7%)
Weapons used by border guards | 31
Criminal proceedings initiated | 197
Administrative penalties imposed | 742
Vessels confiscated | 40
Environmental law violations | 749
Border regime violations | 236
Incidents with seized bioresources | 594
Foreign vessel incidents | 388

Vessel Flags

Flag | Incidents | Notes
Russia | 993 | Domestic violators
Cambodia | 232 | Flag of convenience; 171 had Russian captains
Sierra Leone | 36 | Flag of convenience
Belize | 35 | Flag of convenience
Panama | 31 | Flag of convenience
Japan | 14 | Legitimate foreign vessels
South Korea | 13 |
Georgia | 10 | Flag of convenience
North Korea (DPRK) | 6 |

The most striking pattern is the heavy use of Cambodian flags of convenience. Of 232 Cambodia-flagged vessels, at least 171 had Russian captains, and most listed Phnom Penh as their home port. Phnom Penh is an inland river port, so its appearance as the registry port of Pacific fishing boats is a strong flag-of-convenience indicator. These appear to have been Russian-operated fishing boats registered under Cambodia's lax maritime registry to evade Russian fishing regulations.
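
The flag figures can be reconciled with the aggregate statistics in a few lines. This sketch uses only numbers quoted in this section. The interpretation of the leftover records (incidents counted neither as Russian nor as foreign, presumably with a missing flag field) is an assumption, not something the database states.

```python
# Incident counts per flag state, copied from the Vessel Flags table.
flag_incidents = {
    "Russia": 993, "Cambodia": 232, "Sierra Leone": 36, "Belize": 35,
    "Panama": 31, "Japan": 14, "South Korea": 13, "Georgia": 10,
    "North Korea (DPRK)": 6,
}
TOTAL_INCIDENTS = 1396      # from Aggregate Statistics
FOREIGN_INCIDENTS = 388     # from Aggregate Statistics

# Foreign-flag incidents that the table itemizes by flag.
listed_foreign = sum(n for flag, n in flag_incidents.items() if flag != "Russia")

# Foreign-flag incidents under flags too rare to be itemized.
unlisted_foreign = FOREIGN_INCIDENTS - listed_foreign

# Records that are neither Russian nor counted as foreign (assumed: flag missing).
unattributed = TOTAL_INCIDENTS - flag_incidents["Russia"] - FOREIGN_INCIDENTS

# Share of Cambodia-flagged vessels with a Russian captain (a lower bound: "at least 171").
cambodia_russian_captain_share = 171 / flag_incidents["Cambodia"]

print(listed_foreign, unlisted_foreign, unattributed)
print(f"{cambodia_russian_captain_share:.1%}")
```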

Top Seized Species

Species | Incidents
Crab (various species) | 341
Herring | 87
Sea urchin | 65
Pollock (mintai) | 58
Salmon species | 20
Squid | 18
Cod | 18
Shrimp | 16
Flounder | 16

Weapons Use (31 Incidents)

FSB border guards used weapons in 31 documented incidents, including AK-74 assault rifles, AK-230 naval guns, AK-630 rotary cannons, and aircraft-mounted guns. Typical scenarios included warning shots to stop fleeing vessels and shots "to hit" targeting antenna equipment.

The "Kissin Maru 31" Incident (August 16, 2006)

The database contains the complete operational record of an internationally notorious incident where FSB border guards fatally shot a Japanese fisherman:

  • Japanese fishing vessel "Kissin Maru 31", captain Sakasito Noboru, home port Hanasaki
  • Discovered in the Soviet Strait near the disputed Kuril Islands
  • Refused to stop; after warning shots, fire was opened to hit the target, fatally wounding one crew member
  • Seized: 1,069 kg prickly crab, 30 kg horsehair crab, 10 kg octopus, 25 crab traps
  • Captain convicted under Article 322 (illegal border crossing) and Article 256 (illegal fishing)
  • Fined 250,000 rubles + 245,854 rubles in damages; vessel confiscated

4. FSB Border Guard Communications

Eight text files (~732 messages total) contain intercepted or logged maritime communications to and from FSB border guard offices in the Russian Far East, covering February–May 2012. Additionally, 159 .att files contain Inmarsat satellite messages from vessels to FSB shore stations.

Communication Types

FSB Border Guard Operational Messages

Messages between patrol vessels and shore commands at:

  • Yuzhno-Sakhalinsk, Prospekt Pobedy 63-a: SPU BO FSB RF (Sakhalin Border Guard)
  • Vladivostok, Svetlanskaya 67: Primorsky Border Guard Directorate
  • NPKC: Navigation Patrol Coordination Center

Content includes daily status reports (fuel levels, weather), vessel inspection results, border crossing notifications using standardized KS1-KS15 forms, and intelligence reports.

French Intelligence Ship Surveillance

The border patrol vessel "Shkiper Gek" tracked the French Navy signals intelligence ship DUPUY DE LOME (A759) in the Sea of Japan in May 2012:

"REPORTING RESULT OF AERIAL RECONNAISSANCE — DISCOVERED 1 TARGET PRESUMABLY INTELLIGENCE SHIP, HULL NUMBER A759, NAME DUPUY DE LOME, FLAG FRANCE, AT COORDINATES 46 25.7 N, 138 40.9 E, HEADING 30, SPEED 12 KNOTS" (03.05.2012)

"AT 18:15 06.05 TARGET WAS AT 44 11 N 136 17 E, COURSE 230, SPEED 18 KNOTS, CURRENTLY RADAR CONTACT LOST" (06.05.2012)

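The two position reports above are given in degrees and decimal minutes. The sketch below converts them to decimal degrees and estimates the great-circle distance between the two fixes with the haversine formula. The function names and the spherical Earth radius of 6,371 km are choices made for this illustration, not anything found in the messages.

```python
import math
import re


def parse_deg_min(text: str) -> float:
    """Convert 'DD MM.m H' (degrees, decimal minutes, hemisphere) to signed decimal degrees."""
    m = re.fullmatch(r"\s*(\d+)\s+(\d+(?:\.\d+)?)\s*([NSEW])\s*", text)
    if m is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, hemisphere = int(m.group(1)), float(m.group(2)), m.group(3)
    value = degrees + minutes / 60.0
    return -value if hemisphere in "SW" else value


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points on a sphere, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Coordinates as quoted in the 03.05.2012 and 06.05.2012 messages.
first_fix = (parse_deg_min("46 25.7 N"), parse_deg_min("138 40.9 E"))
second_fix = (parse_deg_min("44 11 N"), parse_deg_min("136 17 E"))

distance_km = haversine_km(*first_fix, *second_fix)
print(first_fix, second_fix, round(distance_km, 1))
```

Because the first message gives no time of observation, the distance cannot be turned into an average speed without further assumptions.
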
Encrypted Communications

The patrol vessel "Manchzhur" transmitted multiple encrypted numeric messages in fixed-length digit groups (four digits per group in the excerpt below), a classic military/intelligence communications format:

5555 6392 0995 2149 3944 8632 6280 3063 9811 2239...

These appear alongside unencrypted position reports and vessel inspection results.
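
The group structure of the excerpt above can be checked mechanically. This small sketch splits the quoted groups (omitting the trailing ellipsis) and reports which group lengths actually occur. It says nothing about the cipher itself.

```python
# The ten groups quoted above; the message continues beyond this excerpt.
excerpt = "5555 6392 0995 2149 3944 8632 6280 3063 9811 2239"

groups = excerpt.split()

# Set of distinct group lengths found in the excerpt.
group_lengths = {len(group) for group in groups}

# True only if every group consists purely of digits.
all_numeric = all(group.isdigit() for group in groups)

print(len(groups), group_lengths, all_numeric)
```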

Inmarsat Satellite Messages (159 files)

Vessel-to-shore messages containing border crossing notifications with vessel identity, call sign, captain name, crew count, cargo declarations (species and weights in kg), and GPS coordinates. All addressed to FSB border guard authorities.

Fishing Operations Data

  • SRTM "SUNTAR": Red king crab catches of 29,751–56,519 kg per trip
  • Transport vessels moving frozen fish between Russian ports and Busan (South Korea), Dalian and Qingdao (China)
  • Vessel "Proliv Longa": carrying 10,172,086 kg of frozen fish product

5. Russian Far East Fisheries & Customs Data

The largest spreadsheet group (1,439 files) contains Russian government fisheries enforcement and customs export data from the Far Eastern Customs Administration (DVTU) and FSB Border Guard Service, covering January–May 2012.

Data Categories

Customs Export Declarations (~642 files)

HTML files saved as .xls, each containing 150–700 individual customs records with 20 fields: vessel name, destination country, HS commodity code, product name (with Latin species name), gross/net weight, registration date, and customs office code.
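
Files like these look like spreadsheets by extension but are really HTML tables, so a binary Excel reader will reject them. The sketch below shows one way to detect the mismatch and pull out the rows using only the Python standard library. The sample markup, column names, and values are placeholders invented for the example, not records from the dump.

```python
from html.parser import HTMLParser


def looks_like_html(head: bytes) -> bool:
    """Heuristic: True if the first bytes look like markup rather than a binary workbook."""
    sniff = head.lstrip().lower()
    return sniff.startswith((b"<html", b"<!doctype", b"<table", b"<meta"))


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row, self._cell = None, None


# Placeholder content standing in for one ".xls" file (hypothetical values).
sample = (b"<html><body><table>"
          b"<tr><th>Vessel</th><th>Destination</th><th>Net weight, kg</th></tr>"
          b"<tr><td>EXAMPLE VESSEL</td><td>KR</td><td>1000</td></tr>"
          b"</table></body></html>")

if looks_like_html(sample[:64]):
    parser = TableRows()
    parser.feed(sample.decode("utf-8"))
    header, *records = parser.rows
    print(header, records)
```

Real files would also need their character encoding detected before decoding; the UTF-8 decode here is a simplification.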

Russia-Korea Bilateral Notifications (~193 files)

Formal bilingual NOTIFICATION/REPLY/INFORMATION documents issued under the Russia-Korea bilateral fisheries agreement by MIFAFF (Korean Ministry of Food, Agriculture, Forestry and Fisheries). Each contains vessel IMO number, radio call sign, shipowner, detailed cargo with weights and HS codes, and fishing license references.

Fishing Quota Data

Files tracking actual catch utilization against allocated quotas by company and region. Example: ZAO "Moneron" — pollock quota in Kamchatka-Kuril subzone: 147.78 tonnes allocated, 147.78 caught (100% utilized).

Aggregate Export Statistics (Extrapolated from 30-File Sample)

Species | Est. Net Weight (tonnes) | Records
Pollock (Theragra chalcogramma) | ~903,000 | 5,059
Fish meal | ~128,000 | 749
Roe (ikra) | ~98,000 | 1,061
Herring | ~58,000 | 288
Cod (Gadus macrocephalus) | ~56,000 | 1,309
Snow crab (Chionoecetes opilio) | ~17,000 | 722
Halibut | ~8,000 | 808
Flounder | ~6,800 | 119
Shrimp (Pandalus) | ~6,000 | 125
Blue king crab (Paralithodes platypus) | ~4,300 | 174
Red snow crab (Chionoecetes japonicus) | ~3,000 | 56
Salmon species | ~2,400 | 47
Total sampled | ~1.37M tonnes |
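
Summing the itemized rows shows how much of the reported total the table accounts for. The figures below are the rounded estimates from the table, so the result is approximate. Reading the remainder as species not itemized here is an assumption.

```python
# Estimated net weight in tonnes per species, copied from the table above.
species_tonnes = {
    "Pollock": 903_000, "Fish meal": 128_000, "Roe": 98_000, "Herring": 58_000,
    "Cod": 56_000, "Snow crab": 17_000, "Halibut": 8_000, "Flounder": 6_800,
    "Shrimp": 6_000, "Blue king crab": 4_300, "Red snow crab": 3_000,
    "Salmon species": 2_400,
}
REPORTED_TOTAL = 1_370_000  # "~1.37M tonnes" from the total row

listed_total = sum(species_tonnes.values())

# Tonnage not explained by the itemized rows (assumed: other species).
unlisted_tonnes = REPORTED_TOTAL - listed_total

# Share of the reported total that is pollock.
pollock_share = species_tonnes["Pollock"] / REPORTED_TOTAL

print(listed_total, unlisted_tonnes, f"{pollock_share:.1%}")
```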

Top Transport Vessels by Volume

Vessel | Est. Tonnes
SIRIUS-1 | ~88,000
Crystal Africa | ~52,000
Tambov | ~45,000
Ice Breeze | ~39,000
Crystal Arctica | ~33,000

Top Shipowners (from bilateral notifications)

  • Okeanrybflot JSC: 12 notifications
  • Dalreefer Shipping Company Ltd: 11
  • Trans Wind Fleet: 10
  • Peta Chemical Company: 9
  • Rybolovetskiy Kolkhoz Vostok-1: 7

Russia-Japan Maritime Border Security (277 PDFs)

Trilingual documents (Russian, Japanese, Korean) covering bilateral border security data exchange between the Japan Coast Guard (1st Regional HQ, Otaru, Hokkaido) and the FSB Border Guard Administration for Primorsky Krai, pursuant to a bilateral protocol dated December 18, 2000.

Contents include:

  • Quarterly vessel port call data for Hokkaido ports (Wakkanai, Monbetsu, Abashiri, Hanasaki, Otaru, Hakodate)
  • Russian and Russian-crewed foreign-flagged vessels entering Japanese ports
  • Cargo: primarily crab, sea urchin, frozen shrimp, pollack roe, fishmeal
  • Evidence of flag-of-convenience usage (Cambodian, Sierra Leonean flags on Russian-crewed ships)
  • Korean MIFAFF "REPLY" forms documenting Russian vessel cargo entering Busan
  • Engineering drawings for border surveillance equipment (TNSK-P "Rubezh"-745 system)

6. Vietnamese Nuclear Program Documents

Over 2,100 files across DOC, PDF, PPTX, XLS, and image formats document Vietnam's nuclear power development program and its cooperation with the IAEA, Russia, Japan, South Korea, France, and Canada. The material dates from 2010–2013.

Document Types

Government Correspondence (1,722 DOC/DOCX files)

Internal documents from Vietnam's Ministry of Science and Technology and its Department of Atomic Energy (Cuc Nang luong Nguyen tu). Primarily in Vietnamese with some English. Includes:

  • Official submissions to the Prime Minister regarding nuclear power development
  • Weekly activity reports from the Department of Atomic Energy
  • Requests for international consultants for nuclear safety assessments
  • Personnel nominations for IAEA training trips (Japan, Russia, Vienna)
  • Translations of World Nuclear News articles
  • Policy drafts on state management capacity for nuclear energy
  • Budget figures: 50M VND for research tasks, $34M USD and 1.99B JPY for consulting contracts

IAEA Cooperation Documents (396 PDFs)

  • Training course nomination forms with candidate CVs, passport details, medical certificates
  • Expert mission confirmations for workshops on radioactive waste management
  • IAEA publications: "Evaluation of the Status of National Nuclear Infrastructure Development"
  • Phone directories for Ministry of Science and Technology units
  • ~43% of PDFs are scanned images requiring OCR

Strategic Presentations (16 PPTX files)

Presentations by VAEA officials and international partners revealing Vietnam's nuclear roadmap:

Planned Facility | Technology | Capacity | Target Date
Ninh Thuan 1 (Phuoc Dinh) | Russia (VVER) | 2 x 1,000 MWe | 2020–2021
Ninh Thuan 2 (Vinh Hai) | Japan | 2 x 1,000 MWe | 2021–2022
6 additional sites | TBD | Various | By 2030

Total projected nuclear capacity by 2030: 10,700 MWe (10.1% of total power generation). HR development budget: 3,000 billion VND (~$150M USD). Training targets: 2,400 engineers + 350 Masters/PhDs for nuclear power by 2020.

Note: Vietnam ultimately cancelled its nuclear power plans in November 2016, citing economic concerns. The Ninh Thuan projects were never built.

IAEA Expert Missions (117 XLS/PPT files)

Detailed planning for 13+ expert missions to Vietnam under IAEA project VIE/4/015, covering NPP site selection (seismic/geotechnical hazards at Ninh Thuan), reactor physics training (VVER reactors), NPP project financing, environmental radiation monitoring, and radioactive waste management. Each entry includes host organization, venue, dates, objectives, and contact persons with full phone/fax/email details.

Images (53 files)

Official Vietnamese government documents, IAEA conference photographs, and scanned passport pages of Vietnamese nuclear officials.


7. Russian Banking & Financial Records

~200 files across DOC, XLS, TXT, and image formats contain Russian financial, banking, and commercial documents centered on St. Petersburg and Moscow-based organizations.

Bank Regulatory Data

  • Central Bank of Russia Form 0409101 (Trial Balance) for AKB "LINK-bank" (OAO), Moscow, August 2010 — complete account-level data with opening balances, debits, credits, closing balances across hundreds of second-order accounts
  • Bank inspection report for SPORTINVESTBANK LLC: organizational structure, management bodies, asset/liability analysis, income/expenses, regulatory capital figures (56 million rubles), compliance findings
  • Corporate profile for OAO "Energomashbank" — SWIFT code (ENEB RU 2P), BIK (044030754), INN (7831000066), general banking license #52, management names
  • Internal credit lending regulations for "ZATO-bank" (with Grozny branch)

Commercial Documents

  • Commercial invoices with full banking details (settlement accounts, correspondent accounts, BIK codes)
  • Construction material cost estimates from OOO "INREMSTROY" (Moscow)
  • Banking service fee comparisons across 8+ Russian banks (Prime Finance, Energomashbank, Bank Saint-Petersburg, GLOBEX, LOKO, PSKB, Promsvyazbank, Baltinvestbank)
  • Real estate contracts in St. Petersburg between "A.T. Invest" LLC and "Kavkazpromstroybank" OJSC
  • ARIN real estate brokerage contracts

Scanned Documents (69 images)

Personnel questionnaires ("ANKETA") with passport-style photos, recommendation letters on Investbank letterhead with stamps/seals, commercial invoices, and property records from Moscow's Central Technical Inventory Bureau (BTI).


8. Turkmenistan Traffic Enforcement Systems

44 spreadsheets document a traffic enforcement camera system deployment for the Turkmenistan Ministry of Internal Affairs (MVD), implemented by MMI Business Services FZC (a UAE Free Zone company) using VOCORD (a Russian traffic camera/ALPR vendor) technology.

Key Findings

System Performance Issues

Technical trouble reports signed by the Head of Communications Department of Turkmenistan MVD (Oraz Taganov) document 14 critical issues:

  • 68% false violation rate — 15,722 of 23,057 violations flagged as false
  • Impossible speed readings (truck recorded at 275 km/h)
  • Street server instability and replication failures
  • Mobile unit malfunctions
  • Operator interface deficiencies
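
The false-violation figure and the impossible speed reading both lend themselves to simple automated checks. The sketch below recomputes the rate from the counts in the trouble reports and shows a basic plausibility filter. The speed ceilings are hypothetical values chosen for illustration and do not come from the VOCORD or MVD documents.

```python
# Counts quoted in the trouble reports.
FLAGGED_FALSE = 15_722
TOTAL_VIOLATIONS = 23_057

false_rate = FLAGGED_FALSE / TOTAL_VIOLATIONS

# Hypothetical sanity ceilings in km/h per vehicle class; not taken from the source.
MAX_PLAUSIBLE_KMH = {"truck": 150.0, "car": 250.0}


def is_plausible_speed(speed_kmh: float, vehicle_class: str) -> bool:
    """Reject readings that are negative or above the ceiling for the vehicle class."""
    ceiling = MAX_PLAUSIBLE_KMH.get(vehicle_class, 250.0)
    return 0.0 <= speed_kmh <= ceiling


print(f"false violation rate: {false_rate:.1%}")
print(is_plausible_speed(275.0, "truck"))  # the reading cited in the report
```

A filter of this kind would only catch gross sensor errors such as the 275 km/h truck. It would not explain the bulk of the false violations.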

IT Infrastructure

  • IBM/Lenovo hardware warranty tracking for Central Bank of Turkmenistan (#00945359) and MMI Business (#00003612)
  • Product divisions: ITS Services, High Volume On Power, Tape Systems, Storage Systems
  • IBM BladeCenter management at 192.168.70.125
  • IBM Power 710 servers running AIX 6.1 at Turkmenistan State Commercial Bank "Presidentbank"

Business Relationships

  • Lenovo service contracts with SLA penalty structures and labor fee schedules
  • IBM PartnerWorld registration (MMI as IBM partner)
  • Named personnel: Iskander Fridun-Zade (MMI Director of Business Development), Dmitry Zavarikin (VOCORD Director)

9. VOCORD Surveillance Infrastructure

Across multiple archive groups, the dump contains operational data from VOCORD, a Russian company that manufactures traffic cameras, ALPR (Automatic License Plate Recognition) systems, and facial recognition technology.

Configuration Files

  • VOCORD Tahion surveillance system configs: gate server addresses (10.1.1.1:43888), JPEG2000 decoder settings, multi-camera display layouts (3200x2400 virtual resolution)
  • Road camera detection zone definitions in XML: lane boundaries, counting lines, speed measurement zones mapped to pixel coordinates
  • Hardware decoder, deinterlacing, and rendering settings (VMR7/VMR9)

Operational Data

  • Installation/reinstallation guides with specific firmware versions (Rev2.1_X4_10_26.11.09.bit, flash-1.0.01674.img.VMx4)
  • VOCORD Traffic Server, Tahion Standard, NetScale, Archive component configurations
  • License key registration procedures via KeyRegistrationManager
  • Overhead nighttime road surveillance images from deployed cameras

Business Context

VOCORD systems were deployed in partnership with MMI Business Services for the Turkmenistan government. The material also includes WebEx meeting invitations about Cisco IPS (Intrusion Prevention System), which suggest MMI had a cybersecurity relationship with Cisco as well (attendee: a.jafarov@mmibs.com, organizer: ralbach@cisco.com).


10. Nested Archives & Additional Material

The 6th RAR file (369 MB) contains 363 nested .rar/.zip archives organized into 4 groups, including 12 password-protected archives that could not be opened.

Ship Tracking Archives (141 files, 109 MB)

IAC_RUFSB_SHIP_TRACAMOSAT — Russian Coast Guard operational data:

  • Ship inspection reports organized by date (named after vessels: Subaru, Knevichi, Professor Kizevetter, Deyrin Maru 5, etc.)
  • 96+ vessel layout diagram images (GMI schemes)
  • Transas Navi-Harbour 4.30 VTS/ODU navigation software licenses for the Far Eastern Coast Guard in Vladivostok, including license keys and PRIMAR user permits
  • Scanned permits, official orders, and economic zone tracking data

Pharmaceutical Patent Archives (97 files, 147 MB)

INNOD_PHARMA_DERW_TRACK — InnoData's Derwent production pipeline (covered in Section 2).

VOCORD / IT Infrastructure Archives (99 files, 88 MB)

VOCORD_MMIBS_RAWDATA — mixed content:

  • VOCORD Tahion surveillance system application logs
  • IBM BladeCenter AIX system snapshots and FFDC diagnostic data
  • Fiber channel switch firmware traces
  • Sysinternals Process Monitor captures (.PML)
  • DiskInternals Uneraser 3.91 with license key (possibly pirated software)
  • Russian fiction e-books (.fb2) — personal files accidentally included

Vietnamese Nuclear Archives (26 files, 43 MB)

VINATOM_RAW_VAEA_RUIAC — additional VAEA material:

  • Technical documentation covering 8 reactor types (BWR, VVER, RBMK, CANDU, AGR, Shippingport, Magnox, PWR)
  • Official applications for establishing the VAEA electronic information portal
  • Mixed-in Russian medical text about growth hormone disorders (wrong project)

11. Implications & Analysis

What Is InnoData?

InnoData Inc. (Nasdaq: INOD) is a publicly traded company headquartered in Hackensack, New Jersey, with operations in the Philippines, India, Sri Lanka, and other countries. Founded in 1988, it provides business process outsourcing (BPO), data engineering, and AI training data services. As of 2025, it has ~5,000 employees and ~$100M annual revenue. Its stock has risen significantly on the AI training data wave.

The Data Security Question

The most significant finding is that InnoData appears to have stored — in a single extractable archive dump — its own internal operational data alongside:

  • Classified Russian FSB border guard communications, including surveillance of foreign military vessels and encrypted intelligence messages
  • An FSB operational database with 1,396 law enforcement incident records including weapons use and a documented fatality
  • Vietnamese government nuclear program records with named officials, passport scans, and budget allocations
  • Russian Central Bank regulatory filings with account-level financial data
  • Turkmenistan Ministry of Internal Affairs traffic enforcement system data

This raises serious questions about data handling, compartmentalization, and security practices at a company that was simultaneously processing sensitive government documents from multiple nations and commercial patent data for Thomson Reuters.

The "IAC" and "RUIAC" Connection

The naming conventions suggest an entity called IAC (Information-Analytical Center) or RUIAC (Russian IAC) was involved in collecting or channeling this data. The prefix patterns — SCAND_RUFSBIACRAW, IAC_VAEARAW, SPB_HOCSAURUN_RUIAC — suggest IAC was an intermediary or a project name within InnoData for processing Russian and Vietnamese government contracts.

Business Intelligence Value

For Fisheries & Maritime Intelligence

The fisheries data is the most commercially valuable subset: ~1.37 million tonnes of documented Russian Far East seafood exports with vessel names, shipowners, HS commodity codes, trade routes, and volumes. Combined with the FSB enforcement database showing flag-of-convenience patterns and IUU (Illegal, Unreported, Unregulated) fishing incidents, this dataset would be relevant for fisheries compliance monitoring, sanctions enforcement, and seafood supply chain transparency.

For Nuclear Non-Proliferation

The Vietnamese nuclear program documents — while now historically moot since Vietnam cancelled its NPP plans in 2016 — contain detailed contact information for senior officials, IAEA project codes, infrastructure timelines, and international cooperation agreements that would have been of significant intelligence value at the time.

For Competitive Intelligence on InnoData

The internal operational data reveals InnoData's pricing structure ($35/document for patent abstracts), quality metrics (95.1% average QC score, 77.3% perfect rate), staffing model (131 editors tracked over 17 months), and disciplinary practices. This is granular competitive intelligence on a publicly traded company's core BPO operation.

Timeline

Period | Data
2006–2011 | FSB maritime enforcement database
2008–2009 | InnoData patent QC reports
2009–2012 | Russian banking/financial documents
2010–2012 | Vietnamese nuclear program correspondence
2011–2012 | VOCORD/Turkmenistan traffic enforcement
Jan–May 2012 | Russian fisheries customs data and FSB comms

Open Questions

  1. How did InnoData come to process FSB border guard operational communications and a law enforcement database?
  2. What is the "IAC" (Information-Analytical Center) referenced in the naming conventions, and what was its relationship to InnoData?
  3. Why were Russian government security documents, Vietnamese nuclear program records, and commercial patent abstracting data stored together in one archive?
  4. What do the 12 password-protected nested archives in the ship tracking group contain?
  5. The Access database contains the FSB's own internal record of the Kissin Maru 31 fatal shooting — how did this operational record end up in a BPO company's data dump?