author Ludovic Pouzenc <ludovic@pouzenc.fr> 2014-09-20 09:17:18 +0200
committer Ludovic Pouzenc <ludovic@pouzenc.fr> 2015-04-14 07:44:29 +0200
commit d6f22a2af48f83d63b5381118d2029797458194e
tree cb6bef9a98335a7af2aee40b0752d14fcee0916e
parent 774194091e9bcee08e48fcdf4127f9afd9d6d644
Early development stages (before SCM) : WIP_1
Early development stages (before SCM) : WIP_2
Early development stages (before SCM) : WIP_3
Early development stages (before SCM) : WIP_4
Early development stages (before SCM) : WIP_6
Early development stages (before SCM) : WIP_7
Early development stages (before SCM) : WIP_8
Adds documentation folder as an Eclipse project.
Adds README for GitHub.
Decent source tree by tuning Eclipse project's location
One forgotten file while moving everything :)
Adding Copyright, licensing (GPL v3), correcting README
Diffstat (limited to 'README.md')
-rw-r--r-- README.md | 62
1 files changed, 61 insertions, 1 deletions
diff --git a/README.md b/README.md
index 1333ed7..a5bbd13 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,61 @@
-TODO
+# SSSync
+SSSync, a Simple and Stupid Synchronizer for data with multi-valued attributes.
+
+## What is SSSync?
+Simple and Stupid Synchronizer performs one-way synchronisation of any data that follows the "key => attributes => values set" data model. Each synchronisation task can have multiple sources and one destination. The multi-valued data model of SSSync is inspired by the one used for LDAP entries. Any stricter model is usable with SSSync, notably SQL result sets and more generally anything that looks like a data table.
+
+### Functional facts
+ - Meant to be started periodically (e.g. nightly)
+ - No daemon, no persistent state, no data change tracking
+ - Dry-run mode and safeguards (max exec time, max insert/update/delete operations)
+ - Minimal data interpretation and mapping (SQL already has a marvellous "as" keyword)
+ - Validate data against schemas, log problems in human-readable form
+ - Never rewrites data that is already up to date (master/slave friendly)
+ - SysAdmin-friendly (crontab-aware, well-defined exit codes, log verbosity)
+
+### Technical facts
+ - Structured configuration files with YAML : simple and stupid, like the rest
+ - No embedded script language, no XML, no ORM mappings
+ - Basically performs one full read of all sources at each run
+ - Small code base (2k SLOC), low algorithmic complexity, key-sorted reads and comparisons
+ - Memory footprint is low (< 64 MiB) whatever the input data size
+ - Performs within seconds, throughput typically limited by destination write rate
+ - Written in Java, uses great libraries like [UnboundID's LDAP SDK](https://www.unboundid.com/products/ldapsdk)
+ - Can be self-contained in a single directory, like portable apps
+ - Packaging for Linux (.deb, .rpm) and Windows (.msi) is planned
+ - Will probably never use more than 20 MiB of disk space
+
+## Connectors
+### Already shipped
+ - OpenLDAP (source/dest)
+ - MySQL (source)
+ - Oracle (source)
+ - Fixed format CSV (source)
+### Connectors not yet implemented
+ - JDBC writer
+ - LDIF reader (painful because LDIF files can mix data and instructions)
+ - Active Directory (because of the lack of a real test environment)
+ - Arbitrary CSV format (lack of formalism implies huge dev/test effort)
+
+## Limitations (of current release)
+ - Values are represented and compared as Java Strings (so, UTF-16)
+ - Don't expect much about binary blobs or non-printable things
+ - No password hashing utilities (but nobody still stores cleartext passwords, right?)
+ - No data manipulation and transformation in SSSync (maybe it's a feature)
+
+## SSSync through examples
+### Medium-sized university LDAP directory
+ - Context : 10000 people, 1000 groups and structures, 4 different sync tasks
+ - Main data source : human resources system stored in an Oracle database
+ - Additional sources : 2 CSV (technical accounts, manual corrections)
+ - Destination : OpenLDAP (master node, with many replicas via syncrepl)
+ - Full run time including dry-run passes : 20 seconds
+(even in September, when there are 400 new students and 50 employee updates to sync per night)
+<p align="center">
+ <img src="http://www.pouzenc.fr/projects/sssync/SSSync_Doc/diagrams/example1_flow.svg"
+ alt="SSSync process with 3 combined sources : Oracle + 2 CSV, one destination : slapd"
+ width="50%"/>
+</p>
+
+### Give me more examples
+Please contact me if you have more examples to put here.
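
Note on the README text added by this patch: the "key => attributes => values set" model and the "key-sorted reads and comparisons" bullet can be illustrated with a small sketch. This is not SSSync's actual code. The class `SortedSyncSketch`, the `Entry` type, and the `INSERT`/`UPDATE`/`DELETE` operation strings are all hypothetical names chosen for illustration. The sketch assumes that both the source and the destination deliver entries already sorted by key, so only one entry from each side has to be held in memory at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SortedSyncSketch {

    /** Hypothetical entry type: one key, plus attribute name -> set of values. */
    public static final class Entry {
        public final String key;
        public final Map<String, Set<String>> attrs;

        public Entry(String key, Map<String, Set<String>> attrs) {
            this.key = key;
            this.attrs = attrs;
        }
    }

    /** Convenience builder for an entry with a single multi-valued attribute. */
    public static Entry entry(String key, String attr, String... values) {
        Map<String, Set<String>> attrs = new TreeMap<>();
        attrs.put(attr, new TreeSet<>(List.of(values)));
        return new Entry(key, attrs);
    }

    /**
     * Walks two key-sorted streams in lockstep and lists the operations
     * needed to make the destination match the source.
     * Entries that are already identical produce no operation.
     */
    public static List<String> diff(Iterator<Entry> source, Iterator<Entry> dest) {
        List<String> ops = new ArrayList<>();
        Entry s = next(source);
        Entry d = next(dest);
        while (s != null || d != null) {
            // An exhausted side compares as "greater" so the other side drains.
            int cmp = (s == null) ? 1 : (d == null) ? -1 : s.key.compareTo(d.key);
            if (cmp < 0) {            // key only in source
                ops.add("INSERT " + s.key);
                s = next(source);
            } else if (cmp > 0) {     // key only in destination
                ops.add("DELETE " + d.key);
                d = next(dest);
            } else {                  // same key: compare attribute value sets
                if (!s.attrs.equals(d.attrs)) {
                    ops.add("UPDATE " + s.key);
                }
                s = next(source);
                d = next(dest);
            }
        }
        return ops;
    }

    private static Entry next(Iterator<Entry> it) {
        return it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<Entry> src = List.of(
            entry("alice", "mail", "alice@example.org"),
            entry("bob", "mail", "bob@example.org", "b@example.org"),
            entry("carol", "mail", "carol@example.org"));
        List<Entry> dst = List.of(
            entry("alice", "mail", "alice@example.org"),
            entry("bob", "mail", "b@example.org"),
            entry("dave", "mail", "dave@example.org"));
        // prints [UPDATE bob, INSERT carol, DELETE dave]
        System.out.println(diff(src.iterator(), dst.iterator()));
    }
}
```

In this sketch the unchanged `alice` entry yields no operation, which matches the README's claim that up-to-date data is never rewritten. A dry-run mode or a cap on the number of operations could be layered on top by inspecting the returned list before applying anything, though how SSSync itself implements those safeguards is not shown in this patch.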