Important: This API is currently in beta testing and is not meant to be used in production as it is likely to change (break) without notice. Please get in touch with your questions, feedback and suggestions for improvements. Please also sign up to the API mailing list (see bottom of this page) to stay updated.
The metadata API shares many design concepts with the search API. Please read the search API's documentation in conjunction with this page.
The API is designed to be very simple to use. You construct a URL and retrieve its contents, and the API returns the results as JSON.
The API is versioned, meaning that each API endpoint specifies the version number. This maintains backwards compatibility: when the API is updated in a way that would break the default behavior, it gets a new version number, so your application will not break.
The API outputs only JSON. Support for XML responses is not planned unless there is significant demand for it. See the Mailing List and Getting Help section below for contact details.
Currently the API does not support JSONP. Given the use case, it is unlikely JSONP support will be added. However, we are happy to reconsider if there is sufficient demand. Contact us (details at the bottom) to talk about this.
The API endpoint is of this format:
In detail:
To get the metadata for the first 100 entries in the database:
Pagination to get the next 100:
These two API calls are 100% equivalent (no specified page implies page 1):
To get the metadata of courses from a specific institution (MIT in this example):
The institution variable is identical to the institution: advanced search operator.
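As a rough illustration of how the page, institution and contact variables combine into a request URL, here is a minimal Python sketch. The base URL is a hypothetical placeholder (substitute the real, versioned endpoint), and the exact values accepted for contact and institution are assumptions, not confirmed by this page.

```python
from urllib.parse import urlencode

# Hypothetical placeholder, NOT the real endpoint: substitute the actual
# versioned metadata API URL.
BASE_URL = "https://api.example.invalid/v1/metadata.json"


def build_metadata_url(contact, page=None, institution=None):
    """Build a request URL from the query variables described above.

    Leaving out `page` relies on the documented default (page 1).
    """
    params = {"contact": contact}
    if institution is not None:
        params["institution"] = institution
    if page is not None:
        params["page"] = page
    return BASE_URL + "?" + urlencode(params)


# First page of everything, then page 2 for a single institution.
# The contact value and the "MIT" spelling are illustrative assumptions.
print(build_metadata_url("you@example.com"))
print(build_metadata_url("you@example.com", page=2, institution="MIT"))
```

Using `urlencode` keeps the contact value safely escaped whatever form it takes.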
The JSON response returns the following:
Note that dividing TotalResults by 100 and rounding up gives you the number of pages needed to return all of the data.
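The page-count arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the sample totals below are made-up values used only to show the rounding.

```python
PAGE_SIZE = 100  # entries returned per page, as in the examples above


def page_count(total_results):
    """Number of pages needed to cover total_results entries (rounds up)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_results + PAGE_SIZE - 1) // PAGE_SIZE


# Made-up totals, purely for illustration.
for total in (0, 100, 101, 2437):
    print(total, "results ->", page_count(total), "pages")
```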
If no results are found, you will get an HTTP response status of 204 No Content. For example:
So basic usage: create a loop that increments the page variable and calls the API. The moment you get a 204, you're done. Currently there are 25 pages:
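A minimal sketch of such a loop in Python, using only the standard library. The base URL is a hypothetical placeholder, the function names are illustrative, and the `max_pages` safety cap is an addition of this sketch rather than anything the API requires.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical placeholder, NOT the real endpoint.
BASE_URL = "https://api.example.invalid/v1/metadata.json"
CONTACT = "you@example.com"  # replace with your own contact details


def fetch_page(page):
    """Fetch one page and return (HTTP status, parsed JSON or None)."""
    url = BASE_URL + "?" + urlencode({"contact": CONTACT, "page": page})
    with urllib.request.urlopen(url) as response:
        if response.status == 204:  # No Content: past the last page
            return 204, None
        return response.status, json.load(response)


def crawl(fetch=fetch_page, max_pages=1000):
    """Request pages 1, 2, 3, ... and stop at the first 204 response."""
    pages = []
    for page in range(1, max_pages + 1):
        status, payload = fetch(page)
        if status == 204:
            break
        pages.append(payload)
    return pages
```

Passing the fetch function as a parameter makes the loop easy to exercise against a stub before pointing it at the live API.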
Don't forget to change the contact variable to be yours!
VERY important note: The API returns an ordered list of courses using the internal numeric ID for each course called UniqueID, which is NOT a serial number (how it's derived is out of scope for this document). The upshot is that if more courses are added (or removed), the ordered list of UniqueID values will change, so the results you get with the API will change their order. The metadata API currently returns the UniqueID value for each course so you can store it and check if you already know about it in later crawls. This behavior may change in the future, and the mailing list will announce any changes if they occur.
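One way to use UniqueID across crawls is to keep a set of the IDs you have already stored and filter each batch against it. The sketch below assumes only that each course arrives as a dictionary with a "UniqueID" key; the function name and the sample records are hypothetical.

```python
def split_new_courses(courses, known_ids):
    """Return the courses not seen before, recording their IDs in known_ids.

    Because result order can change between crawls, courses are matched by
    UniqueID rather than by their position or page number.
    """
    new_courses = []
    for course in courses:
        unique_id = course["UniqueID"]
        if unique_id not in known_ids:
            known_ids.add(unique_id)
            new_courses.append(course)
    return new_courses


# Made-up records for illustration; real entries carry more fields.
known = {101}
batch = [{"UniqueID": 101}, {"UniqueID": 57}]
print(split_new_courses(batch, known))
```

In a real crawler, `known_ids` would be loaded from and saved to whatever storage holds the course data.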
Please implement results caching in your web application if possible. OCW Search index updates occur every few weeks and so the results for a query will not change in the meantime.
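As one possible approach, a small in-memory cache with a time-to-live can sit in front of your API calls. This is a sketch, not a recommended implementation: the class name is hypothetical and the one-day lifetime in the usage lines is an arbitrary choice, picked only because it is well below the update interval mentioned above.

```python
import time


class TTLCache:
    """Keep fetched results for ttl_seconds before fetching them again."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) if stale."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(key)
        self._entries[key] = (now, value)
        return value


# Usage: the lambda stands in for a real API request.
cache = TTLCache(ttl_seconds=24 * 60 * 60)
print(cache.get_or_fetch("page=1", lambda key: "response for " + key))
print(cache.get_or_fetch("page=1", lambda key: "never called"))
```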
At the moment there isn't a hard rate limiter in place, but API usage is monitored. Heavy users will be contacted (that is why we need the contact details in each API call), and abusers will be banned outright.
As the API output is JSON, any JSON-aware debugging tool will work. Here is a selection:
For contacting us privately, please email apicontact (at) ocwsearch (dot) com.
We also maintain a mailing list, the OCW Search API group, for anyone using or interested in the API. You can subscribe to it or visit the group.
The mailing list is for pretty much anything API related: