Resume of Mark Kerzner (mark.kerzner-at-shmsoft.com)
Hands-on Software Architect
Skills & Technologies
Skills: Distributed and grid computing, AWS (S3, EC2), Hadoop,
MapReduce, MapR, Hive, Pig, Sqoop, Flume, HBase, Cassandra,
high-performance multi-threaded applications.
Tools: Java, Scala, R, NetBeans, IntelliJ, Eclipse, AJAX, C++, Search (Lucene),
Drools, Python/NLTK, JavaScript, Swing, J2EE, JSP, Servlets, PHP,
Messaging (JMS, Tibco), Web Services, JBoss, Weblogic, .NET, C#, VB.NET,
Visual C++, ASP.
Environment: Linux, Windows, MySQL, SQLServer, Oracle.
Business domains: eDiscovery, Legal, Energy, Trading.
Performed Hadoop/Cloud consulting projects for Cognizant, Intel, Deutsche Telekom,
T-Mobile, GHX healthcare and multiple startups. Total number of Hadoop clusters set up so far: 60+.
Creator of an eDiscovery/Enterprise Search solution, SHMcloud (Hadoop, Lucene, Solr, HBase, EC2, S3, and more).
Co-author of the Hadoop Illuminated open source book.
Organizer, presenter and hands-on trainer at Houston Hadoop Meetup.
Professional Experience
07/12-present
SHMsoft, Houston, TX
Software architect & CEO. Involved in a number of Big Data / Hadoop projects in eDiscovery, search, marketing, and training.
Latest clients included Cognizant for multiple projects, Intel for Hadoop training and POC projects
for Intel's clients, and Bank of America for Hadoop administration.
Some of the accomplishments were:
- Created three training courses: Hadoop for Developers, Hadoop for
Administrators, and HBase for Developers. The courses included general
overviews and hands-on exercises; also created VMs for the exercises;
- Delivered the training to internal Intel groups and to Intel's
client. The courses were usually followed by proof-of-concept
implementations in the Big Data areas covered by the training;
- Delivered the training to other trainers, who then delivered it further;
- Maintained Hadoop clusters for dev/staging/production;
- Created scripts to form EC2 clusters for training and for processing;
- Designed an action monitoring system for a large cluster of hundreds of nodes;
- Optimized application performance for a Cassandra cluster.
Technologies used on these projects: Java, Hadoop, Hive, Pig, HBase, R, Sqoop, Flume, Maven, Git.
Endorsement of the training course from an IT department member at a large corporation:
"Regarding your teaching, first let me mention that I'm always honest--
even if it hurts people's feelings. So perhaps you will be
grateful to know that I enjoyed your classes and your presentation
style. I liked that you were knowledgeable so you did not need to
just read the slides and you were able to deviate from the slides when
there were related questions from the audience. When a presenter
just reads from the slides I get bored because I'm thinking I can just
read the slides and the presenter is not needed. I also liked the
many hands-on labs. That is how people learn, not just absorbing
info from the slides but applying it. I would recommend you
as a presenter on Hadoop technology and will recommend you for training
at my company if / when we decide to support HBase in a production
setting."
11/11 - 07/12
Cision, Chicago, IL
Big Data consultant
Prototyped a Big Data system with Hadoop, HBase, Cassandra, Lucene, Solr, Java, and R,
using Cloudera and MapR clusters on EC2. Designed the complete architecture, assuring
acceptable performance. Analyzed free-text marketing information for
influence and marketing impact.
06/11 - 11/11
Nor1, Sunnyvale, CA
Big Data consultant
- Assisted with the addition of Hadoop processing to the IT infrastructure;
- Used Sqoop, Flume, Hive, R for analysis of web site traffic;
- Designed and implemented the database layer with Java, Spring, MyBatis, Maven, and Dozer.
03/11-06/11
GHX, Louisville, CO
Big Data consultant
Architected and prototyped Track and Trace for pharmaceuticals, using Scala,
Cassandra, Hadoop, XML, and REST, with fine-grained certificate-based
access control, a capacity of 1,000-10,000 transactions per second,
and background processes for chain-of-custody verification and fraud
prevention. Tasks accomplished:
- Refactored Cassandra-access code, to allow either Hector or
Thrift access (Factory design pattern), replacing the original Thrift
code interspersed throughout the application;
- Designed Hadoop jobs to verify chain-of-custody and look for fraud indications;
- Prepared multi-cluster test harness on EC2 to exercise the system for performance and failover.
01/11 - 03/11
ChooChee, Mountain View, CA
Big Data consultant
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance;
- Reviewed the HBase NameNode/AvatarNode design for failover;
- Wrote MapReduce/HBase jobs.
08/10 - 12/10
ExtremeTix, Houston, TX
High performance applications consultant
- Developed a high-performance cache, making the site stable and improving
its performance;
- Implemented complex discount logic using Drools.
12/08 - 08/10
HighGate, New Jersey
Architect
Designed and developed a high-performance cloud-based eDiscovery system
(Java, Hadoop, SimpleDB, RDS). This amounted to a sophisticated legal search application and included the following:
- Designed and architected the complete concept from scratch, based on my study of eDiscovery and on my JD work;
- Prototyped the proof-of-concept with Hadoop in two months;
- Created a complete processing engine, based on Cloudera's
distribution, enhanced to include a custom Amazon Machine Image (AMI),
which served as a release unit and contained all the necessary custom
Linux utilities;
- Created an operator console to start and manage eDiscovery
clusters on EC2; created an operator GUI application, which worked
inside the private cloud and automated all Hadoop and S3 related
operations;
- This project was later re-implemented from scratch as the
open-source project FreeEed, this time based on Hadoop and a choice
of NoSQL database (HBase, Cassandra, S3, or CouchDB).
01/10 - 07/10
Quiz Revolution
Architect
Cloud-based PHP/MySQL/Java.
10/08-08/09
Exobox, Houston, TX.
Senior Developer
Developed text analysis and business intelligence applications based on
EC2/Hadoop/Nutch. Tasks included web scraping, document conversion,
search index creation, automatic categorization, and duplicate detection,
using Java technologies and open-source projects. Later ported the
complete infrastructure into the EC2 cloud.
09/08-10/08
UBS, Houston, TX.
Consultant Developer for Commodities Trading
- Developed high-performance trading applications on a
high-reliability, multi-threaded framework based on Spring;
- Signal Suite: real-time, high-volume (100k+ messages per second)
data analysis with uses ranging from algorithmic trading to
system monitoring. Based on Esper for ESP (Event Stream Processing) and
CEP (Complex Event Processing).
07/06-09/08
Merrill Lynch Commodity Trading, Houston, TX.
Senior Developer for eConnect.
Projects included:
- re-engineering the system to improve performance and to bring the GUI to
today's look and feel;
- ICE/eConnect integration;
- integration of third-party trade systems;
- intensive testing, bug detection, and fixes.
Technologies used: Swing, Java, Weblogic, JMS, SQLServer, Hibernate, Linux, Windows.
01/05-07/06
BaseBase Corporation, Houston, TX.
Senior Software Engineer / Architect.
Projects included:
- a multimedia sharing site with a distributed architecture;
- a social network with AJAX interactions and a multilayer Google Maps
mashup.
Technologies used: JBoss, JMS, MySQL, Hibernate, AJAX, Linux, Windows.
06/03-12/06
HyperAlert, Houston, Texas.
Architect/Developer.
Continuously defined and implemented new features and improved stability,
scalability, and reliability, until HyperAlert became a leading communications
platform for contacting people by phone, email, and web, with real-time
response tracking. Used open-standards architecture with Linux, JBoss,
EJB, JSP, AJAX, VoiceXML, MySQL.
10/03-12/04
Lateral Data, Houston, Texas.
Architect and Lead Developer.
Developed and implemented a unique, massively parallel, scalable
software system for eDiscovery.
03/03-10/03
ODS Petrodata, Houston, Texas.
Java developer.
Maintained and improved various aspects of the ODS Petrodata commercial websites.
The sites are used by subscription by oil companies and energy
operators to plan and execute offshore drilling programs. Technologies
used: Java, Weblogic, XML, JSP, Servlets, and SQLServer.
Specific tasks:
- integrated site search and indexing using open-source Lucene,
replacing Inktomi.
01/02-03/03
SHMsoft, Houston, Texas.
Director, lead developer.
Suggested, designed, and implemented new software products and
improvements. These included:
- Translink, an optimization planner for energy trading (VB/Access, with
advanced optimization in C++), for Energistics, LLP, http://energisticsllc.com
- a software package for delivery services with scheduling, dispatching,
payroll, accounting, and web order entry. Currently used by dozens of
people in 4 cities.
07/2001 - 01/2002
Structure Consulting Group, Houston, Texas.
Contract programmer.
Designed and developed applications for deregulated energy markets.
- Java architect/lead developer for the Trade Manager, which keeps track
of energy trading contracts, energy consumption measurements, and
financial settlements, and controls risk management. Technologies
used: Java, Swing, J2EE, Oracle, Tibco, PL/SQL.
07/2000 - 07/2001
Coral Energy, Houston, Texas.
Contract programmer.
Designed and developed applications for online energy trading, using
Java/Swing, J2EE, EJB, Weblogic/Oracle, Tibco, Endur.
02/2000 - 07/2000
Emerging.com, Houston, Texas.
Contract programmer.
Built commercial B2C and B2B websites. Tasks accomplished:
- www.ashford.com rewrite using Java, Servlets, JSPs, WebLogic, WLCS.
06/1999-02/2000
Enron Energy Services, Houston, Texas.
Contract programmer.
Built Enron's Common Data Platform (CDP), which brought together all enterprise
data. CDP was based on the EJB (Enterprise JavaBeans) specification and
comprised Java and C++ servers, with C++ and Java clients and an ObjectStore
database, communicating through CORBA and XML.
04/1999-06/1999
Dresser-Rand, Houston, Texas.
Contract programmer.
Developed "Global Access", an Internet-based system for remote control of
equipment operation.
04/1996-04/1999
Shell Oil (BTC), Houston, Texas.
Contract programmer.
Developed applications for processing, 3-D modeling, and visualization
of exploration data (123DI, Spir3DVIP) on UNIX.
04/1998-06/1999
Mincom, Pty., Houston, Texas.
Contract programmer.
Developed the OpenWorks/Geolog data server (Java, CORBA, PC, UNIX). Proposed and
developed an innovative graphical user interface to database objects. The
interface is based on the JGO++ library and is used to graphically
configure database mapping.
11/1998-06/1999
Petrophysical Solutions, Inc., Houston, Texas
Contract developer.
Developed complete, novel well log data processing applications using Java,
PC, UNIX, and databases.
12/1995-08/1998 (after 04/1996 continuing part time, at 30 hours/week).
Applied Training Resources, Houston, Texas.
Contract programmer.
Developed Procedure Maker, a multimedia information management system for
petrochemical refineries.
02/1995-12/1995
Western Atlas International, Houston, Texas.
Contract programmer.
Designed and implemented applications for database storage of well log
data. Technologies used: C++/MFC, VB, Windows, UNIX, WIND/U.
05/1993-02/1995
Oilware, Inc., Houston, Texas.
Contract programmer.
Developed and implemented a C++ library of 100+ classes for a new data exchange
standard (RP66 and DLIS). The 20,000-line code base was completely
designed, implemented, and tested in 1.5 years.
08/1984-04/1993
Halliburton Logging Services, Inc., Houston, Texas.
Contract programmer.
Projects accomplished:
- designed and implemented a prototype for an object-oriented
geological database;
- implemented parts of a custom client-server system for multi-user access to
the above database;
- designed and implemented new computer applications using AI and image
processing.
02/1979-05/1984
Dresser Atlas, Inc., Houston, Texas.
Started as Systems Analyst, left as Senior Computer Research Specialist.
Received Dresser Industries Golden Creativity Award in 1984.
Projects accomplished:
- new computer applications for log analysis;
- systems for log processing, databases, interactive and hard copy
displays.
Education
Hadoop bootcamp, Redwood City, CA, by ScaleUnlimited, 2009
MapR training with Zaloni, Chicago, IL, 2012
Novus University
School of Law
JD, 2007
St. Petersburg University, Russia.
MS in Math, 1978.
St. Petersburg Electrical Engineering Institute.
MS in Computer Science, 1978.
St. Petersburg 239 Lyceum
Certifications
Java Programmer Certification (Sun).
MCSD (C++, VB path) Certification (Microsoft).
Publications & Misc
"Professional Java E-Commerce", WROX, 2002.
"Image Processing in Well Log Analysis", Prentice Press, 1985
Three US Patents for computer software/well log analysis.
Mensa Member since 1983
IEEE Member since 1980