Dr. Sofia Athenikos, PhD²

(PhD, Information Science × PhD, Philosophy)

~/tagline

Versatile senior software engineer with 13+ years of shipping high-impact systems — often end-to-end, often as the sole owning engineer. Currently having fun building AI-driven applications.

──02/STATS
13+
Years of Experience
9
Companies
0
Domains
0+
Projects Owned
100+
Internal Teams Onboarded
3,000+
Serviceable Daily Jobs
──03/OWNER

Systems I've built end-to-end.

nCountR

Retail Data Pipeline & Analytics Suite

RR Donnelley · 2025

THE PROBLEM

Retail customer journey data and transaction data came from two distinct external sources, with no association clues between them beyond dates. Both arrived as files via SFTP — one file per journey, one per transaction — with no pipeline to ingest, clean, or route them to a usable destination (Snowflake, S3, Kafka). Data analysts and scientists were blocked from deriving the insights advertisers and marketers needed: how ad exposure affects customer dwell time, journey patterns, and conversion.

IMPACT

Enabled data analysts and scientists to evaluate ad effectiveness across exposure, dwell, and conversion, unlocking a new analytics capability for the business and positioning it for the highly anticipated launch of a new flagship product.

STACK

Scala · Python · Kafka · Snowflake · S3 · SFTP · SQL

Reactive Platform

Event-Driven Workflow Orchestration & Execution Service

Vericast → RR Donnelley · 2023-2025

THE PROBLEM

The legacy batch-processing system executed jobs at predefined times, regardless of when input data actually arrived. This introduced significant latency between data availability and data processing — jobs sat idle waiting for their scheduled windows, even when the data they needed to act upon was already present. With 3,000+ daily jobs running across internal engineering teams, this latency compounded into a system-wide bottleneck.

IMPACT

Designed to service 3,000+ daily jobs across internal engineering teams and to eliminate the schedule-driven latency inherent in the legacy system. The decoupled, event-driven architecture replaced rigid batch windows with on-demand reactivity — processing jobs as soon as their input data arrived rather than at fixed times. The platform reached full implementation, testing, and validation in collaboration with the first prospective client team, and was ready for full-scale production rollout.
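
To make the contrast with schedule-driven execution concrete, here is a minimal sketch in which a job runs the moment its last required input arrives. The names (`Job`, `EventDrivenDispatcher`, `on_data_arrived`) and the in-memory bookkeeping are assumptions for illustration only. They are not the Reactive Platform's actual API, which was built on Java and Kafka.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A hypothetical job that may run once all required inputs exist."""
    name: str
    required_inputs: frozenset
    arrived: set = field(default_factory=set)

    def is_ready(self) -> bool:
        # Ready when every required input has been seen.
        return self.required_inputs <= self.arrived


class EventDrivenDispatcher:
    """Runs jobs in reaction to data-arrival events, not at fixed times."""

    def __init__(self, jobs):
        self.jobs = list(jobs)
        self.executed = []  # names of jobs that have run, in order

    def on_data_arrived(self, dataset: str) -> list:
        """Record an arrival and run every job that just became ready."""
        triggered = []
        for job in self.jobs:
            # Skip jobs that do not need this dataset or have already run.
            if dataset not in job.required_inputs or job.name in self.executed:
                continue
            job.arrived.add(dataset)
            if job.is_ready():
                self.executed.append(job.name)
                triggered.append(job.name)
        return triggered


# Illustrative job and dataset names (hypothetical).
dispatcher = EventDrivenDispatcher([
    Job("daily_report", frozenset({"clicks", "impressions"})),
    Job("click_audit", frozenset({"clicks"})),
])
first = dispatcher.on_data_arrived("clicks")
second = dispatcher.on_data_arrived("impressions")
print(first, second)
```

In this sketch a job never waits for a clock: `click_audit` runs on the first event, and `daily_report` runs as soon as its second input lands.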

STACK

Java · Python · Kafka · Grafana · Graphite · OpenTelemetry

HDFS Image Analyzer

HDFS Storage Analysis & Archival Recommendation Tool

Vericast · 2023

THE PROBLEM

HDFS had accumulated roughly 8 million files — a massive, ever-growing storage footprint with significant cost implications. There was no automated way to identify candidates for archival or deletion, so the Platform team faced an arduous manual burden whenever storage cleanup became necessary. The scale alone (millions of items, deeply nested directory trees) made manual inspection impractical.

IMPACT

Deployed and used in production by the Platform team to manage HDFS storage at scale (~8M items). Eliminated the arduous manual effort previously required to identify archival candidates, and gave the team a flexible, policy-driven tool to keep storage costs in check as the HDFS footprint continued to grow.
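
One way to picture a policy-driven archival check is as a set of composable predicates applied to file metadata. The sketch below is an illustration under assumptions: `FileEntry`, `older_than`, `larger_than`, and `archival_candidates` are hypothetical names, and the paths and thresholds are made up. The production tool was a Java command-line utility operating on the HDFS image.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical subset of per-file metadata."""
    path: str
    size_bytes: int
    last_accessed: datetime


def older_than(days, now):
    """Predicate: the file was last accessed more than `days` before `now`."""
    cutoff = now - timedelta(days=days)
    return lambda entry: entry.last_accessed < cutoff


def larger_than(size_bytes):
    """Predicate: the file is larger than `size_bytes`."""
    return lambda entry: entry.size_bytes > size_bytes


def archival_candidates(entries, predicates):
    """Return entries that satisfy every predicate, largest first."""
    matches = [e for e in entries if all(p(e) for p in predicates)]
    return sorted(matches, key=lambda e: e.size_bytes, reverse=True)


# Illustrative data; the paths, sizes, and dates are invented.
now = datetime(2023, 6, 1)
entries = [
    FileEntry("/data/logs/2019/a.log", 5_000_000, datetime(2019, 3, 1)),
    FileEntry("/data/logs/2023/b.log", 9_000_000, datetime(2023, 5, 30)),
    FileEntry("/data/tmp/c.bin", 100, datetime(2018, 1, 1)),
]
policy = [older_than(365, now), larger_than(1_000_000)]
report = archival_candidates(entries, policy)
print([e.path for e in report])
```

Keeping each rule as a separate predicate lets a policy be changed by editing the list rather than the scanning code.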

STACK

Java · HDFS

Dark Traffic Proxy (DTP)

Production Traffic Shadowing Service

Twitter · 2022-2023

THE PROBLEM

Twitter's engineering organization was undertaking a company-wide initiative to migrate services off an aging, proprietary architecture onto modern, industry-standard tools and frameworks. A critical piece of the legacy stack was Diffy Proxy — a long-standing service that engineering teams relied on to shadow production traffic to test destinations, allowing them to validate code changes against real-world traffic patterns before deploying to production. Diffy Proxy worked, but it was built on the old architecture and was incompatible with the modernization effort. A replacement was needed: one that preserved the same essential functionality while fitting cleanly into Twitter's modern infrastructure.

IMPACT

The successful completion of the DTP project — with 100+ internal engineering teams onboarded and migrated from Diffy Proxy without production disruption — was recognized as a major milestone in Twitter's modernization initiative. DTP was officially designated as the recommended traffic-shadowing service for all engineering teams at Twitter.

STACK

Scala · TypeScript · React · Finagle · Finatra

Entitlement Data Migration

Hierarchical Restructuring & Two-Way Sync

Bank of America Merrill Lynch · 2017-2018

THE PROBLEM

The Equity Linked Technology (ELT) group at BoA Merrill Lynch — thousands of traders, engineers, and executives — managed its role-based access (RBA) entitlement data as a flat, unstructured list. Two long-standing problems made the system painful and risky:

  • No hierarchy. Granting or auditing access required clicking through entitlements one item at a time. A trader requesting access to 100 trading books had to select each one individually; supervisors approving the request had to review each one individually; auditors verifying compliance had to check each one individually. The process was arduous and error-prone at every step.
  • One-way sync only. Entitlement data flowed in only one direction between the UI/service and the database, creating opportunities for the two sources of truth to drift out of sync — a serious risk for compliance-sensitive data subject to periodic audits.

Entitlement data at a major financial institution is non-public, business-critical, and regulated. It had to be 100% correct, 100% consistent, and 100% auditable. The problem was long-standing: ELT had previously been unable to find an engineer to take it on. And critically, the change required was irreversible: once the migration ran, there was no rollback. There was zero margin for error.

IMPACT

The migration delivered three outcomes that solved a long-standing problem ELT had been unable to address:

  • Existing entitlement data successfully migrated to the new hierarchical structure with zero data loss.
  • Two-way sync pipeline between UI/service and database running daily, eliminating the inconsistency risk of the legacy one-way design.
  • A significantly streamlined UI presenting entitlements as a resource tree, allowing users to request access by clicking the highest applicable node — replacing individual-item clicking with hierarchical selection.

The production release was completed without incident and recognized by ELT leadership as a successful outcome on a project that had previously resisted resolution.
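
The benefit of hierarchical selection can be shown with a small tree: granting access at one node implies access to every leaf beneath it. This is a minimal sketch, not the migrated data model. `ResourceNode`, `find`, `entitlements_for`, and the desk/book names are hypothetical.

```python
class ResourceNode:
    """A node in a hypothetical entitlement tree; leaves are trading books."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def leaves(self):
        """Return the names of all leaf resources under this node."""
        if not self.children:
            return [self.name]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result


def find(node, name):
    """Depth-first search for the node with the given name."""
    if node.name == name:
        return node
    for child in node.children:
        found = find(child, name)
        if found is not None:
            return found
    return None


def entitlements_for(root, granted_node_name):
    """Expand a grant on one node into the leaf resources it covers."""
    node = find(root, granted_node_name)
    return [] if node is None else node.leaves()


# Illustrative tree; the names are invented.
tree = ResourceNode("desk", [
    ResourceNode("emea", [ResourceNode("book-1"), ResourceNode("book-2")]),
    ResourceNode("apac", [ResourceNode("book-3")]),
])
print(entitlements_for(tree, "emea"))
```

A single request on `emea` stands in for one request per book, which is the same saving for the requester, the approver, and the auditor.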

STACK

Java · JAX · Spring Boot · iBATIS · Oracle DB · Toad

Content Extraction

Core Platform Refactoring & Service Implementation

Flipboard · 2015-2016

THE PROBLEM

Content extraction is the heart of Flipboard. The application lets users specify topics of interest and delivers relevant content — news, blogs, social media, RSS feeds — by scraping publisher sites, extracting the content, and reformatting it for delivery across web, mobile, and other devices. Without reliable content extraction, there is no Flipboard.

The existing content extraction codebase was production-critical but had accumulated significant quality debt. Methods routinely spanned hundreds of lines, were difficult to understand, and were hard to safely extend or modify — a serious problem for a system this central to the product.

IMPACT

The contributions were twofold:

  • Production functionality. The features I implemented — schema/validation infrastructure, PDF extraction, config and management services, and the restructured extraction pipelines — were deployed and used in production, serving the core content delivery that defines the Flipboard product.
  • Codebase durability. The refactored and rewritten modules significantly improved the maintainability and extensibility of the most central engineering surface at Flipboard, making it easier for other engineers to safely work in code that had previously resisted modification.

STACK

Java · Spring · Tomcat · S3 · XML/DOM
──04/TIMELINE

13+ years across 9 companies.

2024 - 2026

RR Donnelley

Principal Software Engineer · Full Time · Remote

Continued as a core member of the 3-member R&D (NERD) team (with the addition of a support engineer) after Vericast's Valassis subsidiary was acquired by RR Donnelley, reporting to the (same) VP of Engineering and delivering high-priority, high-visibility projects. Worked cross-functionally to shape the product roadmap and accelerate project delivery.

Key contributions

  • Continued sole engineering ownership of the Reactive Platform — the event-driven workflow orchestration and execution service originally started at Vericast — through full implementation, testing, validation with the first prospective client team, and readiness for production rollout.
  • Added comprehensive observability to the Reactive Platform using Grafana, Graphite, and OpenTelemetry, improving system reliability and reducing debugging time.
  • Conducted R&D for the nCountR retail analytics project as the sole engineer — from proof-of-concept through full implementation of the end-to-end data pipeline, the journey-transaction correlation algorithm, and the customer journey visualization.

Stack

Java · Scala · Python · SQL · Kafka · Snowflake · S3 · Grafana · Graphite · OpenTelemetry

2023 - 2024

Vericast

Principal Software Engineer · Full Time · Remote

Joined as one of two principal-level engineers on the core R&D (NERD) team at Valassis Digital Corp, a Vericast subsidiary, reporting directly to the VP of Engineering. Conducted high-priority, high-visibility R&D — establishing proof-of-concepts that guided the product roadmap, then driving the systems and services to full implementation.

Key contributions

  • Developed the HDFS Image Analyzer command-line utility as the sole engineer — a tool that scans the HDFS image (~8M files), applies user-defined filtering predicates, and generates archival recommendation reports. Deployed and used in production by the Platform team to manage storage at scale, contributing to storage cost savings while eliminating arduous manual cleanup effort.
  • Engineered the Reactive Platform as the sole engineer — an event-driven workflow orchestration and execution service comprising three core services (Event Generation, Batching, Processing), an Overseer service, client utilities, and a configuration UI. Designed to replace the legacy batch-processing system and reduce data processing latency for the 3,000+ daily jobs run by internal engineering teams.

Stack

Java · Kafka · HDFS

2019 - 2023

Twitter

Sr. Software Engineer · Full Time · Remote

Member of the Capacity Correctness & Reliability team within Core Infra | Performance Engineering. Team focus: Implementing critical services for Twitter service reliability and efficiency, development velocity, and capacity planning.

Key contributions

  • Refactored the "Perfy" codebase (Scala) — Twitter's performance/load/regression testing service — and added a broad set of new capabilities: health-check client for target service instances, encryption of API connections, dynamic target-instance lookup via platform specs, custom target mapping, metric collection and storage across test types, a metrics dashboard, email/Phabricator notifications with test details and results, and real-time metric push.
  • Refactored the "RedCurve" codebase (Scala) and added significantly enhanced dashboards — including an upgraded taskrun dashboard, a service-data dashboard, a historical dashboard for visualizing service stats over time, and the ability to extract and store specs for target instances and taskruns.
  • Refined the "PagerDuck" codebase (Go) and implemented new features.
  • Created shared libraries by refactoring common code out of Perfy and RedCurve — reducing duplication and improving maintainability.
  • Built and rolled out Dark Traffic Proxy (DTP) — a new production traffic shadowing service (Scala) that replaced the legacy Diffy Proxy. Solo-designed and implemented the full service, then onboarded 100+ internal engineering teams to production. DTP was officially designated as Twitter's recommended traffic-shadowing service.

Team recognition: "Improving Our Focus" Award at Twitter OneTeam 2020.

Stack

Scala · Go · TypeScript · React · Finagle · Finatra · Aurora · Mesos · Grafana

2018 - 2019

Morgan Stanley

Associate (Senior Software Engineer) · Full Time · New York, NY

Member of the Quantitative and Structured Products group within Institutional Securities Technology | ESTAR. Team focus: Automated pricing system for equity derivative trading.

Key contributions

  • Implemented a modified pricing-context re-allocation algorithm — improving how the pricing system reuses and reassigns computational context across pricing runs.
  • Devised a pricing-result diagnostics data exchange format (XML/JSON) and implemented the data mapping classes that convert pricing-result objects into diagnostics objects serializable/deserializable to and from the format.
  • Implemented data mapping classes for converting pricing-result objects for consumption by an internal third-party application, with JSON serialization/deserialization.
  • Built a KDB client service — with associated data mapper, logger, reader, writer, and factory classes — for performing read/write operations against KDB (the time-series database used widely in finance).
  • Implemented REST API endpoints for storing and retrieving serialized diagnostics, pricing, and testing data in KDB, along with DAO classes that process pricing and test results and call those endpoints.
  • Implemented a parameter override mechanism for repricing, plus numerous additional features that enhanced the automated pricing system overall.

Stack

Java · Spring Boot · KDB · JSON/XML · Swagger

2017 - 2018

Bank of America Merrill Lynch

Consultant (Senior Software Engineer) · Contract · New York, NY

Sole-engineer engagement for the Equity Linked Technology (ELT) group at BoA Merrill Lynch. Took on a long-standing, high-stakes, irreversible data restructuring and migration project that the group had previously been unable to staff.

Key contributions

  • Restructured ELT's role-based access (RBA) entitlement data — used by thousands of traders, engineers, and executives — from a flat, unstructured list into a hierarchical model, while building the bidirectional sync pipeline that eliminated long-standing inconsistency risk between the UI/service and the database.
  • Devised and executed a 50+ step, two-day production migration with zero data loss and no production disruption. The change was irreversible — once executed, no rollback was possible — and required step-by-step validation throughout.
  • Delivered the full project in 10 months — two months ahead of the originally projected 12-month timeline — while incorporating additional enhancements beyond the original scope at the request of the primary business stakeholder. Recognized by ELT executives as a successful outcome on a project that had previously resisted resolution.

Stack

Java · JAX · Spring Boot · iBATIS · Oracle DB · Toad

2016 - 2017

Bloomberg

Consultant (Senior Software Engineer) · Contract · New York, NY

Member of the Data Intelligence team within the R&D Consumer Media Services group. Worked on Bloomberg's entity tagging, personalization, and recommendation systems — the backend systems that power content tagging and personalized content delivery for consumer-facing media products.

Key contributions

  • Revised and extended the data ingestion pipeline to ingest data from new data sources, broadening the input surface available to downstream tagging and recommendation components.
  • Revised and extended the tagging, personalization, and recommendation systems to accommodate new tag types, enabling richer classification of ingested content.
  • Refactored the tagging system code to use new terminologies — accommodating terminological evolution and extension while improving clarity and consistency across the codebase.

Stack

Java · Spring · Spring Boot · Tomcat · Hadoop · HBase

2015 - 2016

Flipboard

Software Engineer · Full Time · New York, NY

Member of the Content Extraction team within the Platform Engineering group — working on the system that defines Flipboard's product: extracting content from numerous sources and reformatting it for delivery across web, mobile, and other devices.

Key contributions

  • Solo-built or rewrote multiple core components of the content extraction stack — including the central extraction microservice (~57 classes), the core extraction package (~81 classes), a new schema and validation microservice (~39 classes), and a rewritten config and management microservice (~46 classes) with a Git-based file management system newly implemented using JGit.
  • Implemented supporting tooling for the extraction system: a PDF content handler that converts PDF sources to well-formed HTML; a concurrent asynchronous test client for realistic load and behavior validation; and cross-version result validators that catch regressions between releases.
  • Drove deep refactoring across the content extraction codebase proactively — not as an assignment, but because the existing code was getting in the way of safely extending the most critical part of the product.
  • By the numbers, across all of this work: 1,200+ commits, 250K+ lines added, 154K+ lines deleted.

Stack

Java · Spring · Tomcat · S3 · XML/DOM

2013 - 2015

IPsoft

R&D Engineer · Full Time · New York, NY

Member of the core R&D team (~10 engineers) reporting directly to the CEO, building Amelia — an AI/NLP-based virtual customer service agent and the company's emerging flagship product. Worked primarily on the engineering infrastructure that powered and supported Amelia's conversational behavior and reasoning: the conversation workflow execution module and the semantic knowledge management module.

Key contributions

  • Developed and maintained the BPM module, the engine responsible for capturing, executing, and maintaining real-time conversational issue-resolution workflows between Amelia and the user. Each conversation was represented as a graph, and the module handled its full lifecycle, from creation through execution to ongoing management. Collaborated with one other engineer.
  • Devised and implemented algorithms supporting conversational workflow execution and management, including answer-type handling, multiple-choice handling, and graph merging (for consolidating structurally similar conversation flows).
  • Also contributed to the semantic knowledge management module — the framework used to create and operate against a semantic knowledge base storing entities, relationships, and inference rules. Authored inference rules over the knowledge base, improving the system's ability to derive and infer facts during conversations.
  • Collaborated directly with the CEO on R&D direction — including studying and adapting research papers (e.g., on graph merging) into concrete enhancements implemented for and integrated into Amelia.

Stack

Java · Spring · Tomcat · Activiti (BPM) · SQL · Hibernate · Cassandra · Hazelcast · ActiveMQ

2012 - 2013

Amazon

Software Development Engineer · Full Time · New York, NY

Founding member of the New York team built to launch the Amazon Marketing Services (AMS) Advertising Platform — a new self-service product geared toward Amazon sellers (not buyers), enabling them to create brand pages, post advertising messages, and analyze their marketing campaigns. The platform was co-developed by an existing San Francisco team and the newly created NY team; later, ownership was divided such that SF focused on Amazon Pages while NY focused on Amazon Posts and Amazon Analytics.

Key contributions

  • Contributed across the full stack, with Java/Spring-based backend services, AWS-backed storage (S3, RDS), and JavaScript/jQuery on the frontend, typical of e-commerce self-service platforms of the period. Built web services following a multi-tier architecture for Amazon Posts and Amazon Analytics, supporting the successful launch of the AMS platform.
  • Restructured and refactored the codebase proactively according to the principle of clean separation of responsibilities, improving understandability, maintainability, and extensibility.
  • Later belonged to a 3-engineer API focus subgroup within the NY team, designing, implementing, and maintaining public-facing REST APIs.

Stack

Java · Spring · REST · AWS (S3, RDS) · SQL · JavaScript · jQuery
──05/AI-DRIVEN WORK

My AI-related work experience started at IPsoft in 2013, with Amelia — an early AI/NLP-based customer service agent. Since then my work experience has been focused on backend development for distributed systems and data pipelines in various domains. Inspired by the recent advancements in generative AI, I am now returning to AI as the primary focus.

Currently Building
Active Development

AI Receptionist

Voice-AI Agent for Domain-Specific Use Cases

Started May 2026 · In active development

A voice-AI agent prototype that handles incoming phone calls for appointment scheduling, customer support, and lead capture. Currently targeted at dental practices as the initial niche.

The system integrates a Vapi-based voice assistant (Deepgram Nova 2 transcription, GPT-4o cluster model, 11Labs voice synthesis) with a Node.js/Express/TypeScript backend exposing three custom tool endpoints (check_availability, book_appointment, capture_lead) plus call transfer. The backend is deployed on Railway with Supabase as the persistent data store.
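
The three tool endpoints can be thought of as named handlers behind a single dispatcher. The sketch below is written in Python for brevity rather than the project's TypeScript, and it keeps state in memory instead of Supabase. The class name, argument names, return shapes, and slot format are all assumptions. It does not reproduce Vapi's request or response payloads.

```python
class ReceptionistTools:
    """In-memory stand-in for the three tool endpoints (hypothetical shapes)."""

    def __init__(self, open_slots):
        self.open_slots = set(open_slots)  # e.g. "2026-06-01T09:00"
        self.appointments = {}             # slot -> patient name
        self.leads = []

    def check_availability(self, date):
        """Return the open slots on the given date, in sorted order."""
        return sorted(s for s in self.open_slots if s.startswith(date))

    def book_appointment(self, slot, patient):
        """Book the slot if it is still open; otherwise report why not."""
        if slot not in self.open_slots:
            return {"booked": False, "reason": "slot unavailable"}
        self.open_slots.remove(slot)
        self.appointments[slot] = patient
        return {"booked": True, "slot": slot}

    def capture_lead(self, name, phone):
        """Store contact details for later follow-up."""
        self.leads.append({"name": name, "phone": phone})
        return {"captured": True}

    def dispatch(self, tool_name, arguments):
        """Route a tool call by name to its handler."""
        handlers = {
            "check_availability": self.check_availability,
            "book_appointment": self.book_appointment,
            "capture_lead": self.capture_lead,
        }
        handler = handlers.get(tool_name)
        if handler is None:
            return {"error": f"unknown tool: {tool_name}"}
        return handler(**arguments)


tools = ReceptionistTools(["2026-06-01T09:00", "2026-06-01T10:00"])
booking = tools.dispatch(
    "book_appointment", {"slot": "2026-06-01T09:00", "patient": "A. Caller"}
)
remaining = tools.dispatch("check_availability", {"date": "2026-06-01"})
print(booking, remaining)
```

Rejecting a booking for a slot that is no longer open keeps the agent from confirming an appointment that a previous caller already took.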

End-to-end conversation flows have been verified through real test calls against the deployed system. The project is currently in active iteration as the deployment is being stabilized for full demo.

Tools Used

Claude · GPT · Vapi · Deepgram · 11Labs · Ngrok · Railway · Supabase · TypeScript · Node.js · Express.js
Past AI Project

Multi-Model Deep Learning Chatbot · 2017

A chatbot application integrating multiple deep learning models I trained myself, each providing a distinct capability accessible through natural conversation:

  • Image Recognition
  • Sentiment Detection
  • Time Series Prediction
  • Sentence Generation

Each capability was demonstrated through dialog with the chatbot — users could converse with the agent and trigger any of the underlying DL models depending on their request. A personal project built on personal time, outside of the day-job scope, but demonstrated at Bank of America Merrill Lynch's company-wide tech innovation fair, where the exhibit drew the largest crowd as the audience favorite.

Foundations & Continuous Learning

Building Gen AI Java Apps with LangChain4J — Beyond the Basics

LinkedIn Learning · Completed Feb 2026

Advanced techniques for building production-grade generative AI applications using LangChain4J in the Java ecosystem.

Building Gen AI Java Apps with LangChain4J — Introduction

LinkedIn Learning · Completed Feb 2026

Foundational concepts and patterns for integrating LLMs into Java applications through the LangChain4J framework.

Deep Learning Nanodegree

Udacity · Completed Oct 2017

Hands-on coverage of neural networks, CNNs, RNNs, GANs, and deep reinforcement learning — with multiple project deliverables. The chatbot project mentioned above was built by applying techniques learned here.

Machine Learning Specialization (First 4 Courses)

Coursera · Taught by Andrew Ng · Completed Apr 2016

Foundational coverage of supervised and unsupervised machine learning (e.g., clustering, classification, regression).

Plus additional ML coursework across multiple platforms.

Tools Used

Python (Scikit-Learn, NumPy, SciPy, Pandas) · TensorFlow · Keras · FloydHub · Dato · MATLAB · Octave · Weka · DL4J
──06/TOOLKIT
AI / Gen AI / ML / DL
LLM (Claude, GPT) · LangChain4J · Vapi · Deepgram · 11Labs · Agentic Architecture · Prompt Engineering · Python (Scikit-Learn, NumPy, SciPy, Pandas) · TensorFlow · Keras · DL4J · FloydHub · Dato · MATLAB · Octave · Weka
Programming Languages
Java · Scala · Python · Go · TypeScript · Kotlin · Rust · R · C++ · C · Solidity · Lisp · Scheme
Backend & Microservices
REST API · Spring · Spring Boot · Spring Data · Spring Cloud · Netflix OSS (Eureka, Hystrix, Ribbon, Zuul) · Tomcat · Servlet · JSP · Java Handler
Data Platforms & Streaming
Kafka · Spark · Hadoop · MapReduce · HDFS · HBase · ElasticSearch
Data Storage, ORM, & Caching
SQL · PostgreSQL · MySQL · Oracle · KDB · Cassandra · MongoDB · Snowflake · Supabase · iBATIS · JDBC · JPA · Hibernate · Quill · Toad · Hazelcast · H2 · Redis
Architecture & Distributed Systems
Event-Driven Architecture · Agentic Architecture · SOA · Multi-Tier Server-Client Architecture · Microservices · REST · Message-Driven Systems · Workflow Orchestration · BPM (Activiti)
Messaging & Streaming
Kafka · ActiveMQ · RabbitMQ
Cloud & Containerization
AWS (S3, RDS) · Docker · Kubernetes
Observability & Monitoring
Grafana · Graphite · OpenTelemetry
Data Extraction & Processing
XML · DOM · SAX · JAXB · JSON · Jackson · Gson
Frontend & Full-Stack Web Development
HTML · CSS · JavaScript · TypeScript · React · jQuery · AJAX · Bootstrap · KnockoutJS · ExtJS · Swagger · Node.js · Express.js · Zod · Ngrok · Railway · Next.js · Tailwind CSS · Framer Motion · Lucide React · JetBrains Mono · Geist · React Hook Form · Formspree · Vercel
Testing & QA
JUnit · ScalaTest · Mockito · PowerMock · EasyMock
Build & CI/CD
Maven · Gradle · Ant · Ivy · Pants · Bazel · Cargo · Jenkins · Hudson · GitLab
Version Control & Code Review
Git · GitHub · GitLab · BitBucket · SVN · Perforce · Phabricator · ReviewBoard
Statistics & Data Science
R · RStudio · SPSS
Semantic Web & Rule Engines
OWL · RDF · SPARQL · Protégé · Sesame · PowerLoom · Drools
──07/ABOUT

I'm a backend-focused senior engineer with 13+ years of building high-impact systems across diverse domains such as e-commerce, AI/NLP, media, finance, social media, and advertising. My core strengths are backend development — APIs, microservices, distributed systems — and codebase refactoring and re-architecting. Across my roles I've often built things end-to-end as the sole owning engineer.

Before my software engineering career, I completed PhDs in Philosophy and Information Science, with research in information extraction / retrieval / visualization, conceptual modeling, semantic web, and knowledge engineering, which continues to inform how I think about systems today.

How I work: autonomously, with minimal guidance, driving projects from inception to completion. I enjoy building things from scratch, going from 0 to 1. In doing so, I also think deeply and carefully about implications, edge cases, and architectural choices. I write code that is not only functional but clean and well-structured, and I take pride in the quality of what I produce. I quickly learn new tools, domains, and technologies as needed, and I currently apply this same approach to building AI-driven applications.

>Seeking Senior / Staff / Principal Software Engineer roles —

>AI-driven, with maximum personal space and latitude to

>ship, own, and deliver.
