Clerk can be very helpful when exploring any kind of data, including the sorts of things for which we might turn to the Semantic Web. To give a sense of what that's like, this notebook gives some examples of querying WikiData for facts about the world.
First, we bring in Clerk, the Clerk viewer helpers, Mundaneum (a WikiData wrapper that uses a Datomic-like syntax), and Arrowic (to draw graphviz-style box-and-arrow graphs).
Now we can ask questions, like "what is James Clerk Maxwell famous for having invented or discovered?"
The WikiData internal ID :wd/Q1080745 doesn't immediately mean much to a human, so we'll try again by appending Label to the end of the ?what logic variable so we can see a human-readable label for that item:
Ah, better. 😊 This ceremony is required because WikiData uses a language-neutral data representation internally, leaving us with an extra step to get readable results. This can be a little annoying, but it does have benefits. For example, we can ask for an entity's label in every language for which it has been specified in WikiData:
One of the nice things about data encoded as a knowledge graph is that we can ask questions that are difficult to pose any other way, then receive answers as structured data for further processing.
Here, for instance, is a query asking for things discovered or invented by anyone who has as one of their occupations "physicist":
It's great that we can retrieve this information as a sequence of maps that we can explore interactively in Clerk, but sometimes it's more pleasant to display data organized in a table view:
:whatLabel | :whomLabel |
---|---|
World Wide Web | Tim Berners-Lee |
Hypertext Transfer Protocol | Tim Berners-Lee |
HyperText Markup Language | Tim Berners-Lee |
Semantic Web | Tim Berners-Lee |
WorldWideWeb | Tim Berners-Lee |
Io | Galileo Galilei |
Callisto | Galileo Galilei |
Europa | Galileo Galilei |
Ganymede | Galileo Galilei |
Trapezium Cluster | Galileo Galilei |
Galilean transformation | Galileo Galilei |
solar variation | Galileo Galilei |
Square-cube law | Galileo Galilei |
Q1535340 | Galileo Galilei |
Galilean micrometer | Galileo Galilei |
tholin | Carl Sagan |
carbon chauvinism | Carl Sagan |
Encyclopedia Galactica | Carl Sagan |
fluorine | André-Marie Ampère |
Ampère's force law | André-Marie Ampère |
480 more elided |
Once we see how a given table looks, we might decide that it would be better if, for example, these inventions were grouped by inventor. This is just the sort of thing that Clojure sequence functions can help us do:
Julius Plücker | Plücker surface |
Paul Dirac | Dirac equation ; Dirac spinor ; Dirac large numbers hypothesis |
Hans Geiger | Geiger counter |
Karl Weierstraß | (ε, δ)-definition of limit ; Q111354260 |
Friedrich Paschen | Paschen series |
Pierre Curie | polonium ; radium ; piezoelectricity ; Curie temperature |
Alexander Lippisch | variometer |
Friedrich Kohlrausch | Kohlrausch bridge |
Johannes Stark | Stark effect |
Karl Schwarzschild | Schwarzschild telescope ; Schwarzschild effect |
Johann Christian Poggendorff | potentiometer |
J. J. Thomson | electron ; electromagnetic waveguide ; discovery of electrons |
Hendrik Lorentz | Lorentz transformation |
Hans Christian Ørsted | aluminium ; Oersted's law |
Leonardo da Vinci | Leonardo's robot ; sfumato ; Coulomb friction ; Leonardo's crossbow ; Leonardo's… (45 more elided) |
Otto von Guericke | Magdeburg hemispheres ; baroscope |
Anaximander | spontaneous generation |
Pierre-Simon Laplace | nebular hypothesis |
Albert Einstein | general relativity ; special relativity ; mass–energy equivalence ; theory of re… (157 more elided) |
David Hilbert | epsilon calculus ; Hilbert's nineteenth problem |
67 more elided |
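The grouping step described above can be sketched with plain Clojure sequence functions. This is a minimal illustration, not the notebook's original code: the `results` vector is a small hand-picked sample shaped like the query output (maps with `:whatLabel` and `:whomLabel` keys), and `group-by-inventor` is a hypothetical helper name.

```clojure
(require '[clojure.string :as str])

;; A tiny sample shaped like the query results (assumed shape:
;; one map per discovery, with :whatLabel and :whomLabel keys).
(def results
  [{:whatLabel "polonium"       :whomLabel "Pierre Curie"}
   {:whatLabel "radium"         :whomLabel "Pierre Curie"}
   {:whatLabel "Geiger counter" :whomLabel "Hans Geiger"}])

(defn group-by-inventor
  "Returns one [inventor joined-discoveries] row per inventor.
  Discoveries keep their original order within each group."
  [rows]
  (->> rows
       (group-by :whomLabel)
       (mapv (fn [[whom items]]
               [whom (str/join " ; " (map :whatLabel items))]))))

(group-by-inventor results)
```

The result is a vector of two-element rows, which is one of the shapes a table viewer can render directly. The order of the rows themselves follows the map returned by `group-by`, so sort them explicitly if a stable order matters.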
Some data are more naturally viewed in other ways, of course. In this example we find every instance of any subclass of "human settlement" (village, town, city, and so on) in Germany that has a German-language place name ending in -ow or -itz, both of which indicate that it was originally named by speakers of a Slavic language.
The :coordinate-location in this query is the longitude/latitude position of each of these places in a somewhat unfortunate string format. The mapv at the end converts these lonlat strings into key/value pairs so Vega can plot the points on a map. This gives us a very clear picture of which parts of Germany were once Slavic-speaking:
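The string-to-map conversion mentioned above might look something like the following sketch. It assumes the coordinate strings use the WKT-style form `Point(longitude latitude)`; the `parse-point` helper, the `:lonlat` key, and the sample places are hypothetical and only illustrate the idea.

```clojure
(defn parse-point
  "Parses a string such as \"Point(13.5 52.25)\" into a map with
  :longitude and :latitude as doubles. Returns nil if the string
  does not match the assumed format."
  [s]
  (when-let [[_ lon lat]
             (re-matches #"Point\((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)" s)]
    {:longitude (Double/parseDouble lon)
     :latitude  (Double/parseDouble lat)}))

;; Hypothetical rows, shaped like query results with a :lonlat string.
(def places
  [{:name "Example-ow"  :lonlat "Point(13.5 52.25)"}
   {:name "Example-itz" :lonlat "Point(14.0 51.5)"}])

;; Replace each :lonlat string with separate numeric keys.
(def plottable
  (mapv (fn [place]
          (merge (dissoc place :lonlat)
                 (parse-point (:lonlat place))))
        places))
```

Each resulting map carries plain numeric `:longitude` and `:latitude` values, which is the kind of flat record a plotting spec can refer to by field name.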
Sometimes the data needs a more customized view. Happily, we can write arbitrary hiccup to be rendered in Clerk. We'll use this query to fetch a list of different species of Apodiformes (swifts and hummingbirds), returning the name in English and Japanese, an image of the bird itself, and a map of that bird's home range for each one.
English | Japanese | Photo | Range |
---|---|---|---|
Bee Hummingbird | マメハチドリ | ||
Bee Hummingbird | マメハチドリ | ||
Sapphire-bellied Hummingbird | ルリハラハチドリ | ||
Blue-fronted Lancebill | スミレビタイヤリハチドリ | ||
Broad-tailed Hummingbird | フトオハチドリ | ||
Blue-capped Puffleg | ズアオワタアシハチドリ | ||
Broad-billed Hummingbird | アカハシハチドリ | ||
Band-tailed Barbthroat | オビオヒゲハチドリ | ||
Hoary Puffleg | ハイイロアシゲハチドリ |
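Since hiccup is just nested Clojure vectors, a custom table like the one above can be assembled with ordinary functions. The sketch below is an assumption about how one might do it, not the notebook's original code: `bird-row` and `bird-table` are hypothetical helpers, the map keys are invented for illustration, and the image URLs are placeholders.

```clojure
(defn bird-row
  "Builds one hiccup table row from a map describing a bird."
  [{:keys [english japanese photo-url range-url]}]
  [:tr
   [:td english]
   [:td japanese]
   [:td [:img {:src photo-url :width 100}]]
   [:td [:img {:src range-url :width 100}]]])

(defn bird-table
  "Builds a complete hiccup table with a header and one row per bird."
  [birds]
  [:table
   [:thead
    [:tr [:th "English"] [:th "Japanese"] [:th "Photo"] [:th "Range"]]]
   (into [:tbody] (map bird-row birds))])

;; Placeholder data; the URLs are not real image locations.
(def birds
  [{:english   "Bee Hummingbird"
    :japanese  "マメハチドリ"
    :photo-url "https://example.org/bee-hummingbird.jpg"
    :range-url "https://example.org/bee-hummingbird-range.png"}])

(bird-table birds)
```

The value returned by `bird-table` is plain data, so it can be inspected at the REPL before being handed to an HTML viewer for rendering.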
Another useful technique when dealing with semantic or graph-shaped data is to visualize the results as a tree. Here we gather all the languages influenced by Lisp or by languages influenced by Lisp (a transitive query across the graph), and visualize them in a big network diagram.
Because Clerk's html viewer also understands SVGs, we can just plug in an existing graph visualization library and send the output to Clerk.
The graph is really huge, so you'll need to scroll around a bit to see all the languages.
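To make the idea of a transitive "influenced by" relationship concrete, here is a small local sketch. It uses a different technique from the notebook, which asks the knowledge graph to follow the relationship transitively: this version walks a hand-written adjacency map breadth-first. The `influenced` map is a tiny illustration written by hand, not WikiData output, and `reachable-edges` is a hypothetical helper.

```clojure
;; Hand-written sample: language -> languages it directly influenced.
(def influenced
  {"Lisp"        ["Scheme" "Common Lisp"]
   "Scheme"      ["Clojure" "JavaScript"]
   "Common Lisp" ["Clojure"]
   "Clojure"     []})

(defn reachable-edges
  "Walks the graph breadth-first from root and returns the set of
  [from to] edges reachable through any chain of influence."
  [graph root]
  (loop [frontier [root]
         seen     #{root}
         edges    #{}]
    (if (empty? frontier)
      edges
      (let [node     (first frontier)
            children (get graph node [])
            unseen   (remove seen children)]
        (recur (into (vec (rest frontier)) unseen)
               (into seen unseen)
               (into edges (map (fn [child] [node child]) children)))))))

(reachable-edges influenced "Lisp")
```

The returned set of edge pairs is the kind of input a box-and-arrow drawing library needs: one arrow per pair, with shared nodes such as "Clojure" appearing only once in the picture.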
I hope this gives you some ideas about things you might want to try!