Learning Clojure/Meta Data

From Wikibooks, open books for an open world
Metadata

Official meta-data documentation: http://clojure.org/metadata

Currently this documentation leaves a lot to be desired.

(meta obj)
Returns the meta-data of obj, returns nil if there is no meta-data.
(with-meta obj map)
Returns an object of the same type and value as obj, with map as its meta-data. 

This seems simple enough, but it's not the whole story. In the following section we will go over examples of using meta-data.
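As a first taste, here is a minimal sketch of the two functions on a plain vector. The names plain and tagged and the :source key are made up for illustration; they are not part of Clojure.

```clojure
;; A plain vector literal carries no meta-data.
(def plain [1 2 3])

;; with-meta returns a new vector with the map attached; `plain` is untouched.
(def tagged (with-meta plain {:source "example"}))

(prn (meta plain))      ; prints nil
(prn (meta tagged))     ; prints {:source "example"}
(prn (= plain tagged))  ; prints true, meta-data plays no part in equality
```

The last line is the important one: meta-data describes a value without changing what the value is equal to.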

Meta-data is used to describe things in Clojure, and it is used everywhere. All of the main functions in Clojure carry meta-data in the form of documentation. Two ways of accessing this meta-data are (find-doc "...") and (print-doc symbol). The difference is that find-doc takes a string and usually prints a long list of documentation for every matching function, while print-doc takes a single name and prints only that name's documentation. The transcripts on this page appear to come from an early Clojure release; in current releases the usual entry point is (doc name) from clojure.repl.

user=> (print-doc find-doc)
-------------------------
clojure.core/find-doc
([re-string-or-pattern])
  Prints documentation for any var whose documentation or name
 contains a match for re-string-or-pattern
nil

So, what does meta-data look like? Since this is a Clojure book, we might as well take a look at meta-data within Clojure itself. The code for the Clojure core can be found at https://github.com/richhickey/clojure/blob/04764db9b213687dd5d4325c67291f0b0ef3ff33/src/clj/clojure/core.clj

The first piece of code in the core is the name-space declaration. This is typical at the beginning of most Clojure files; it is a way to prevent name clashes between different pieces of code that are used together.

(ns ^{:doc "The core Clojure language."
       :author "Rich Hickey"}
  clojure.core)

The meta-data code is ^{:doc "The core Clojure language." :author "Rich Hickey"}, and the code follows the form (symbol ^meta symbol-with-meta). Meta-data in this form follows the pattern ^{keyword value, keyword value}. The meta-data fields in this line of code are :doc and :author, and they are applied to the clojure.core symbol. The ^ before the map is a reader macro: at read time it attaches the map as meta-data to the form that follows it, much as (with-meta) does for a value at run time. With this knowledge you should be able to do most of what you want with meta-data. It is easy to write, as shown above, and it looks hard to get wrong. However, let's work through some examples based on what is seen in the core and the official documentation.
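The effect of ^ on a symbol can be observed directly by quoting the symbol so that it is not evaluated. This is only a small sketch; my-symbol and the values in the map are placeholders invented for the example.

```clojure
;; Quoting stops evaluation, so we get back the symbol the reader produced,
;; together with the meta-data map that followed the ^.
(def sym '^{:doc "demo" :author "someone"} my-symbol)

(prn sym)                  ; prints my-symbol
(prn (:doc (meta sym)))    ; prints "demo"
(prn (:author (meta sym))) ; prints "someone"
```

In the ns form above the same thing happens to the clojure.core symbol, and ns then uses that meta-data when it sets up the name-space.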

In the following examples we'll explore meta-data in finer detail and try to understand why things sometimes work and sometimes don't, and how to troubleshoot meta-data puzzles. First let's get the documentation out of the meta function.

user=> (print-doc meta)
-------------------------
java.lang.NullPointerException (NO_SOURCE_FILE:0)

Hmm... we know that (print-doc) itself works, since (print-doc find-doc) printed the documentation for (find-doc). So what's up with meta? Let's dig for the answer as to why this simple code isn't working.

user=> (source meta)
(def
 ^{:arglists '([obj])
   :doc "Returns the metadata of obj, returns nil if there is no metadata."
   :added "1.0"}
 meta (fn meta [x]
        (if (instance? clojure.lang.IMeta x)
          (. ^clojure.lang.IMeta x (meta)))))

It turns out that meta does have documentation in the form of meta-data. Well, maybe we can just try to extract that. Maybe there is a bug in (print-doc).

user=> (meta meta)
{:line 182}

This doesn't make much sense. It is clearly different from the meta-data that we see in the source code of (meta). At least this is a step up from (print-doc)'s null pointer exception. Maybe (meta) is broken; let's try it on something.

user=> (meta find-doc)
{:ns #<Namespace clojure.core>, :name find-doc, :file "clojure/core.clj", :line 3825, :arglists ([re-string-or-pattern]), :added "1.0", :doc "Prints documentation for any var whose documentation or name\n contains a match for re-string-or-pattern"}

It seems like (find-doc) has a lot of meta-data. It has the :line meta-data that we saw from (meta meta), and some other weird stuff like :name and :arglists. Who in their right mind would put that in their meta-data?

user=> (source find-doc)
(defn find-doc
  "Prints documentation for any var whose documentation or name
 contains a match for re-string-or-pattern"
  {:added "1.0"}
  [re-string-or-pattern]
    (let [re  (re-pattern re-string-or-pattern)]
      (doseq [ns (all-ns)
              v (sort-by (comp :name meta) (vals (ns-interns ns)))
              :when (and (:doc (meta v))
                         (or (re-find (re-matcher re (:doc (meta v))))
                             (re-find (re-matcher re (str (:name (meta v)))))))]
               (print-doc v))))

It turns out that this function is defined using the (defn) macro. This macro changes how documentation meta-data is entered: the doc string is just a string after the name of the function and before the parameter vector, optionally followed by a map holding any other meta-data you want to add. Yet only three pieces of meta-data appear in the source, so either the macro is adding some, or Clojure is adding some, without us knowing. Clojure is probably adding the :ns, :name, :file, and :line meta-data, while (defn) supplies the others, :doc, :arglists, and :added, from what the user wrote. A complete list of what meta-data the Clojure compiler adds to objects by default can be found at http://clojure.org/special_forms#Special%20Forms--%28def%20symbol%20init?%29
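The same layout can be tried on a function of our own. The sketch below defines a hypothetical square function in the style of find-doc and reads the meta-data back with (var square), a special form that returns the var created by defn rather than the function it holds.

```clojure
;; Hypothetical function, written only to mirror the layout of find-doc.
(defn square
  "Returns x multiplied by itself."
  {:added "1.0"}
  [x]
  (* x x))

;; (var square) returns the var that defn created, not the function.
(def square-meta (meta (var square)))

(prn (:doc square-meta))       ; prints "Returns x multiplied by itself."
(prn (:added square-meta))     ; prints "1.0"
(prn (:arglists square-meta))  ; prints ([x])
(prn (:name square-meta))      ; prints square
```

Only the doc string and :added were typed in by hand; :arglists and :name were filled in along the way.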

(print-doc find-doc) works fine, but (print-doc meta) doesn't. (meta meta) is pretty off, and (meta find-doc) seems to work. So let's figure out the difference between (meta) and (find-doc). First it's good to analyze some of the things that happen when we use Clojure.

user=> (def string "my string")
#'user/string

`user/string` is the name-space-qualified name of the var we just made. Since we are in the `user` name-space we can use the short form `string` to refer to it. There is some funny stuff at the front of 'user/string': "#'". The ' character is a reader macro; when it is put in front of a symbol it means not to evaluate it. It turns out that "#'" is another reader macro, one that is replaced with (var symbol), so #'user/string becomes (var user/string). Looking up the documentation on (var) via (doc var) directs us to http://clojure.org/special_forms . There we are told that (var) is a special form, meaning that it is not written in the core but is an axiom of the language.

"(var symbol)
The symbol must resolve to a var, and the Var object itself (not its value) is returned. The reader macro #'x expands to (var x)."

So, from this definition it is starting to become clearer why (meta meta) didn't work the way we expected. Using meta is different from using #'meta. One gives us the value the var holds and the other gives us the var itself. We want the meta-data of the var, not of its value. Let's try some things out with this new information.

user=> meta
#<core$meta clojure.core$meta@2e257f1b>
user=> (var meta)
#'clojure.core/meta
user=> (meta (var meta))
{:ns #<Namespace clojure.core>, :name meta, :file "clojure/core.clj", :line 178, :arglists ([obj]), :doc "Returns the metadata of obj, returns nil if there is no metadata.", :added "1.0"}

So, now we are getting the real meta data from (meta). Though, what meta data were we getting with (meta meta)?

user=> (meta meta)
{:line 182}

One gives us :line 178, and the other :line 182. This is because the (def meta) is on line 178, and the value it points to, the (fn meta ...) form, is on line 182. The function that meta points to was given no meta-data of its own in the source; its :line entry appears to have been added by the compiler, which records the line on which a form was read. So, now that we have a better understanding of meta-data we can try out some sample code to test it.
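The var/value distinction can be checked with a var of our own that has one :doc on the var and a different :doc on the value. This is a sketch; the name numbers and both doc strings are invented for the example.

```clojure
;; The ^{...} map ends up on the var; with-meta puts a second map on the value.
(def ^{:doc "meta-data on the var"} numbers
  (with-meta [1 2 3] {:doc "meta-data on the value"}))

(prn (:doc (meta (var numbers))))  ; prints "meta-data on the var"
(prn (:doc (meta numbers)))        ; prints "meta-data on the value"
```

The two maps live side by side and never get mixed up, which is the same situation as (meta (var meta)) versus (meta meta).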

user=> ^{:doc "a number"} 5
java.lang.IllegalArgumentException: Metadata can only be applied to IMetas
user=> ^{:doc "a number"} "5"
java.lang.IllegalArgumentException: Metadata can only be applied to IMetas
user=> ^{:doc "a number"} :5
java.lang.IllegalArgumentException: Metadata can only be applied to IMetas
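The exception names IMeta, the same clojure.lang.IMeta interface that the source of meta tests for. A quick way to check a value before trying to tag it is sketched below; supports-meta? is a hypothetical helper, not a core function.

```clojure
;; Hypothetical helper: does this value implement clojure.lang.IMeta?
(defn supports-meta? [x]
  (instance? clojure.lang.IMeta x))

;; Numbers, strings and keywords do not.
(prn (map supports-meta? [5 "5" :5]))
; prints (false false false)

;; Collections and symbols do.
(prn (map supports-meta? [[1 2 3] '(1 2) {:a 1} #{1} 'sym]))
; prints (true true true true true)
```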

So, it looks like we can only apply meta-data to things that implement IMeta. So far we know that a (def) produces something that supports it, but maybe there are some other things that do too? The Clojure REPL also binds some handy vars, *1, *2, and *3, which hold the most recent, second most recent, and third most recent results. The following examples will use these.

user=> [1 2 3]
[1 2 3]
user=> *1
[1 2 3]
user=> ^{:doc "a vector"} *1
[1 2 3]
user=> (meta *1)
nil
user=> ^{:doc "a vector"} [1 2 3 4]
[1 2 3 4]
user=> (meta *1)
{:doc "a vector"}

In the above code the first attempt, ^{:doc "a vector"} *1, appears to do nothing: the ^ reader macro attaches the meta-data to the symbol *1 itself rather than to the vector it evaluates to, and the vector, being immutable, is not modified. However, the last lines of input show that we can give an anonymous object meta-data when the ^ form is placed directly in front of a literal. It will work with the other collections too, not just vectors. The same result can be had at run time with (with-meta), for example:

user=> (with-meta '(1 2 3 4 5) {:doc "a list of numbers"})
(1 2 3 4 5)
user=> (meta *1)
{:doc "a list of numbers"}

Depending on the situation, either the ^ reader macro or the explicit (with-meta) call may be easier to read. Note the difference in position of the meta-data: before the object with ^, after it with (with-meta).
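The two forms are not always interchangeable, though. The sketch below repeats the earlier *1 experiment with a named var v (a name chosen for the example) so that it can be run outside the REPL.

```clojure
(def v [1 2 3])

;; The reader attaches the map to the symbol `v`; evaluating the symbol
;; yields the untouched vector, so no meta-data is found.
(prn (meta ^{:doc "a vector"} v))             ; prints nil

;; with-meta is an ordinary function call: it receives the value of `v`
;; and returns a copy carrying the meta-data.
(prn (meta (with-meta v {:doc "a vector"})))  ; prints {:doc "a vector"}

;; The original vector is unchanged either way.
(prn (meta v))                                ; prints nil
```

So ^ is the natural choice in front of a literal or a name being defined, while (with-meta) is the one to reach for when the object is the result of evaluating something.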

Meta-data can have meta-data:

user=> (def meta-test (with-meta [1 2 3] {:doc (with-meta [4 5 6] {:doc "vec of 4 5 6"})}))
#'user/meta-test
user=> (:doc (meta meta-test))
[4 5 6]
user=> (:doc (meta (:doc (meta meta-test))))
"vec of 4 5 6"

In the above example I gave meta-data to meta-test that contained a vector, which in turn carried its own meta-data. The nesting of meta-data can make for some interesting structures. It can be used to create versions of your data, or whatever you can think of. Though, any use of meta-data in objects beyond simple tagging should be left to macros/functions and be well tested.
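As one possible reading of the versioning idea, the sketch below keeps a counter in the meta-data and bumps it through a small function. bump-version and the :version key are hypothetical names, and the code assumes a Clojure release that has update in clojure.core (1.7 or later).

```clojure
;; Hypothetical helper: bump a :version counter kept in the meta-data.
;; vary-meta applies a function to the existing meta-data map.
(defn bump-version [coll]
  (vary-meta coll update :version (fnil inc 0)))

(def v0 [1 2 3])

;; conj on a vector keeps the meta-data, so the counter survives the edit.
(def v2 (-> v0 bump-version (conj 4) bump-version))

(prn v2)         ; prints [1 2 3 4]
(prn (meta v2))  ; prints {:version 2}
(prn (meta v0))  ; prints nil
```

Keeping the bookkeeping inside one function follows the advice above: callers never touch the meta-data map directly.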