ZODB: The Graph Database for Python Developers

The ZODB is a mature graph database written in Python and optimized in C. Just subclass Persistent Object and Persistent Container, and your objects, graphs and applications become persistent. This talk teaches you what you need to know to start using a Pythonic graph database.

Tags: Web

Scheduled on Thursday at 11:20 in room cubus


Christopher Lozinski (@PythonLinks)

Christopher Lozinski is a serial entrepreneur and MIT graduate. Rather than seek venture capital, he moved his company to Poland. MQTT is used for hierarchical chat on PythonLinks.info.


You can see the current version of the slides at https://pythonlinks.info/presentations/zodbtalk.pdf

I invite you to first watch the full but slightly earlier version of the talk at PythonLinks.info/zodb

And then read the following summary to see what else is being added to the talk.

The ZODB is a mature graph database written in Python and optimized in C. Just subclass Persistent Object and Persistent Container, and your objects, graphs and applications become persistent.

The market for graph databases has recently exploded, as evidenced by over $200 million invested in graph database companies. Most graph databases are written in Java.

If you are a Python developer, you will be far more productive with a graph database written in Python than with one written in statically typed Java. In a statically typed language you cannot add an attribute to an object, or remove one, at run time. Furthermore, the major Java databases constrain you to one of several persistent data types. Persistent Python, supported by the ZODB, allows you to make any Python data structure persistent. Publishing JSON, YAML and pickles is well supported. GraphQL is conceptually very close to the ZODB schema approach.
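The point about run-time attributes can be illustrated with the standard library alone. This is only a sketch: the `Person` class and its attributes are made up for illustration, and plain `pickle` stands in for the database. It relies on the fact that an object's state is, by default, its attribute dictionary, so whatever attributes exist at the moment of saving are what gets stored.

```python
import pickle


class Person:
    """Hypothetical class; no schema is declared anywhere."""

    def __init__(self, name):
        self.name = name


alice = Person("Alice")

# Add an attribute at run time that the class never mentioned.
alice.email = "alice@example.org"

# Remove an attribute at run time.
del alice.name

# The state that would be stored is the instance's current attribute dict.
state = pickle.dumps(vars(alice))
print(pickle.loads(state))
```

No migration step or schema change was needed between adding the attribute and storing the object.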

Okay, the ZODB is interesting, but is it risky? The ZODB is mature, rock solid and well supported. It is heavily used in the Plone world. The government of Brazil alone has over 100 websites using the ZODB, including the President's office, parliament and many other government offices. The ZODB has recently been reengineered and now supports thousands of write transactions per second.

The major applications of graph databases are fraud detection, social networks and computer networks. NLP is an interesting application area.

The talk reviews the basic concepts of traversal and views on objects.
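As a rough sketch of the two concepts, traversal can be thought of as walking a path one name at a time through nested containers, and a view as a small piece of code that renders the object found at the end. The `Folder`, `Page`, `traverse`, `render` and `VIEWS` names below are hypothetical and are not the API of the ZODB or of any web framework built on it.

```python
class Folder(dict):
    """Hypothetical container: child objects are looked up by name."""


class Page:
    """Hypothetical leaf object."""

    def __init__(self, title):
        self.title = title


def traverse(root, path):
    """Walk from the root to the object named by a slash-separated path."""
    obj = root
    for name in path.strip("/").split("/"):
        if name:
            obj = obj[name]  # each path segment selects one child
    return obj


# Views: functions that render an object, registered per (class, view name).
VIEWS = {
    (Page, "index"): lambda page: f"<h1>{page.title}</h1>",
    (Folder, "index"): lambda folder: ", ".join(sorted(folder)),
}


def render(root, path, view_name="index"):
    """Traverse to an object, then apply the view registered for its class."""
    obj = traverse(root, path)
    return VIEWS[(type(obj), view_name)](obj)


root = Folder(docs=Folder(zodb=Page("ZODB"), zeo=Page("ZEO")))
print(render(root, "/docs/zodb"))  # <h1>ZODB</h1>
print(render(root, "/docs"))       # zeo, zodb
```

The URL structure falls out of the object graph itself, so no separate routing table is needed in this sketch.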

It is important to understand the basics of how objects are stored on disk. Objects are pickled, and there are multiple ways to store those pickles. With FileStorage, the objects in a transaction are appended to the end of the database file. With RelStorage, a record is created with the object id, the version number, and the pickle. The talk reviews how objects are distributed across multiple Python processes. With ZEO the pickles are served across the network. Connections are encrypted. The talk also discusses how to build real-time (chat and IoT) applications using the MQTT message broker with the ZODB.
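The append-only idea can be sketched with a toy store. This is not the real on-disk format, which is more involved. The `AppendOnlyLog` class is hypothetical, an in-memory buffer stands in for the database file, and a transaction counter plays the role of the version number.

```python
import io
import pickle


class AppendOnlyLog:
    """Toy store: every commit appends records, nothing is overwritten."""

    def __init__(self):
        self.file = io.BytesIO()  # stands in for a database file on disk
        self.index = {}           # object id -> offsets of all its revisions
        self.next_tid = 1

    def commit(self, changes):
        """Append one (oid, tid, state) record per changed object."""
        tid = self.next_tid
        self.next_tid += 1
        self.file.seek(0, io.SEEK_END)
        for oid, state in changes.items():
            offset = self.file.tell()
            pickle.dump((oid, tid, state), self.file)
            self.index.setdefault(oid, []).append(offset)
        return tid

    def load(self, oid, revision=-1):
        """Return (tid, state); revision=-1 means the latest revision."""
        self.file.seek(self.index[oid][revision])
        _, tid, state = pickle.load(self.file)
        return tid, state


log = AppendOnlyLog()
log.commit({1: {"title": "Draft"}})
log.commit({1: {"title": "Final"}})
print(log.load(1))     # (2, {'title': 'Final'})
print(log.load(1, 0))  # (1, {'title': 'Draft'})
```

Because old records are never overwritten, earlier revisions of an object remain readable, which is what makes viewing and restoring history possible in a design like this.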

Performance, scalability, and the number of objects are all discussed. Comparisons are made to traditional relational databases.

The ZODB Demo makes it very easy to start building your own applications on top of the ZODB. You can start by customizing the TreeLeaf, TreeBranch and TreeRoot classes and their templates. You get CRUD for free.

The demo includes the traditional relational CRUD: Create, Read, Update, and Delete. It also includes the extended graph CRUD: rename a leaf or branch, cut and paste leaves or branches, and copy and paste leaves or branches. Viewing and restoring historic versions is also demonstrated.
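The extended graph operations can be sketched on a tree of nested containers. The `Branch` class and the three helper functions are hypothetical names chosen for illustration, and plain dictionaries stand in for leaves. This is not the demo's own code.

```python
import copy


class Branch(dict):
    """Hypothetical container node: maps child names to child nodes."""


def rename(parent, old, new):
    """Give a child a new name without touching its contents."""
    if new in parent:
        raise KeyError(f"{new!r} already exists")
    parent[new] = parent.pop(old)


def cut_paste(source, name, target):
    """Move a leaf or a whole subtree to another branch."""
    target[name] = source.pop(name)


def copy_paste(source, name, target, new_name=None):
    """Duplicate a leaf or subtree; the copy is independent of the original."""
    target[new_name or name] = copy.deepcopy(source[name])


root = Branch(drafts=Branch(talk={"title": "ZODB"}), published=Branch())
rename(root["drafts"], "talk", "zodb-talk")
copy_paste(root["drafts"], "zodb-talk", root["drafts"], "zodb-talk-backup")
cut_paste(root["drafts"], "zodb-talk", root["published"])
print(sorted(root["drafts"]), sorted(root["published"]))
```

Each operation works the same way on a single leaf and on an entire subtree, because a subtree is just another value held under a name in its parent.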

Of course the real reason to use a graph database is to improve the user experience. A basic concept in human factors is to limit lists to 7 items. That is why librarians use hierarchy. The Panama Papers journalists said a graph database was more intuitive. Have you ever selected your country from a list of 150 countries? It is much better to use a hierarchical list. Have you ever used a Google map with thousands of pins? It is much better to have one page for each city.

And of course the most important reason for using a graph database is not what the software does, but how it changes how we humans think about our problems, and how we make decisions. Graph databases enable a different approach to distributing applications across the network. They encourage a different approach to managing the git development process. They enable a different set of decisions to be made.

By the end of this talk, attendees should have a much better appreciation for the rich but little-known and underappreciated ZODB ecosystem.