All functions

build_corpus_index()
Build a Parquet ID-lookup index
compatibility_report()
Render and open the compatibility report
extract_doi()
Extract DOIs or Components from Character Vectors
id_block()
Compute ID block from OpenAlex IDs
infer_json_schema()
Infer unified JSON schema using DuckDB
jq_execute()
Execute a jq transformation from an OpenAlex-style JSON to JSONL
lookup_by_id()
Look up records by OpenAlex ID
oa_cache_schema()
Populate the local baseline-schema cache from a snapshot metadata directory
oa_normalize_duckdb_type()
Canonicalise a DuckDB type string
oa_works_abstract_sql()
Return the DuckDB SQL expression that reconstructs a plain-text abstract from the abstract_inverted_index column in OpenAlex works data.
oa_works_citation_sql()
Return the DuckDB SQL expression that builds a short citation string from the authorships and publication_year columns in OpenAlex works data.
opt_api_key()
Get API key for OpenAlex API
opt_filter_names()
Get available filter names from OpenAlex API
opt_select_fields()
Get available select fields from OpenAlex API
prepare_snapshot()
Prepare a directory for OpenAlex snapshot management
pro_api_key()
Retrieve the OpenAlex Pro API key
pro_download_content()
Download full-text PDFs or TEI XML for OpenAlex works
pro_fetch()
Fetch and convert OpenAlex data to Parquet
pro_query()
Build an OpenAlex request (httr2)
pro_rate_limit_status()
Check OpenAlex rate limit status
pro_request()
Fetch works from OpenAlex
pro_request_jsonl_R()
Convert JSON files to JSONL files
pro_request_jsonl_parquet()
Convert JSON files to Apache Parquet files
pro_request_parquet()
Convert JSON files from pro_request() directly to Apache Parquet
pro_validate_credentials()
Validate OpenAlex credentials
read_corpus()
Read corpus from Parquet Dataset
sample_parquet_n()
Sample rows from Parquet files using DuckDB reservoir sampling
snapshot_to_parquet()
Convert OA snapshot to Parquet format
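
The entry for oa_works_abstract_sql() refers to reconstructing a plain-text abstract from the abstract_inverted_index column. OpenAlex stores abstracts as a mapping from each word to the list of positions at which it occurs. The following is a minimal sketch of that reconstruction in Python, for illustration only: it is not the SQL expression the function returns, and the helper name reconstruct_abstract is hypothetical.

```python
def reconstruct_abstract(inverted_index):
    """Rebuild plain text from a word -> positions mapping.

    Hypothetical helper for illustration; not part of the package.
    """
    # Works without an abstract carry a null/empty index.
    if not inverted_index:
        return ""

    # Flatten the mapping into (position, word) pairs.
    pairs = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]

    # Sorting by position restores the original word order.
    pairs.sort(key=lambda pair: pair[0])
    return " ".join(word for _, word in pairs)


example = {"Hello": [0], "world": [1, 3], "big": [2]}
print(reconstruct_abstract(example))
```

A word that occurs several times appears once as a key with several positions, which is why the pairs are flattened before sorting.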