Package: CDMConnector 2.6.0

Maintainer: Ger Inberg

CDMConnector: Connect to an OMOP Common Data Model

Provides tools for working with observational health data in the Observational Medical Outcomes Partnership (OMOP) Common Data Model format, using a pipe-friendly syntax. References to the Common Data Model database tables are stored, together with their metadata, in a single compound object.

Authors: Ger Inberg [aut, cre], Adam Black [aut], Artem Gorbachev [aut], Edward Burn [aut], Marti Catala Sabate [aut], Ioanna Nika [aut]

CDMConnector_2.6.0.tar.gz
CDMConnector_2.6.0.zip (r-4.7) | CDMConnector_2.6.0.zip (r-4.6) | CDMConnector_2.6.0.zip (r-4.5)
CDMConnector_2.6.0.tgz (r-4.6-any) | CDMConnector_2.6.0.tgz (r-4.5-any)
CDMConnector_2.6.0.tar.gz (r-4.7-any) | CDMConnector_2.6.0.tar.gz (r-4.6-any)
CDMConnector_2.6.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
CDMConnector/json (API)

# Install 'CDMConnector' in R:
install.packages('CDMConnector', repos = c('https://darwin-eu.r-universe.dev', 'https://cloud.r-project.org'))
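
# A minimal first-use sketch, not taken from the package documentation. It illustrates the
# "single compound object" and pipe-friendly syntax described above, using only functions
# that appear in the Exports list (requireEunomia, eunomiaDir, cdmFromCon, cdmDisconnect, %>%).
# Assumptions: the 'duckdb' package is installed (it is not among the listed dependencies),
# the example dataset can be downloaded, and its tables live in a schema named "main".

```r
library(CDMConnector)

# Make sure the example Eunomia dataset is available locally (may download data).
requireEunomia()

# Open a DuckDB connection to a fresh copy of the example OMOP CDM database.
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomiaDir())

# Build the compound cdm reference object; the schema names are assumptions.
cdm <- cdmFromCon(con, cdmSchema = "main", writeSchema = "main")

# Each CDM table is an element of the reference and can be piped into dplyr verbs.
n_persons <- cdm$person %>%
  dplyr::tally() %>%
  dplyr::pull(n)

# Close the underlying database connection when finished.
cdmDisconnect(cdm)
```

# On other database systems only the DBI::dbConnect() call and the schema names should
# need to change; see the "DBI connection examples" article listed below.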

Bug tracker: https://github.com/darwin-eu/cdmconnector/issues

Pkgdown/docs site: https://darwin-eu.github.io

11.75 score | 16 stars | 9 packages | 698 scripts | 2.0k downloads | 62 exports | 44 dependencies

Last updated from: 3e7ab111ea. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 333
source / vignettes | OK | 240
linux-release-x86_64 | OK | 332
macos-release-arm64 | OK | 198
macos-oldrel-arm64 | OK | 231
windows-devel | OK | 361
windows-release | OK | 332
windows-oldrel | OK | 373
wasm-release | OK | 133

Exports: %>%, appendPermanent, asDate, attrition, benchmarkCDMConnector, bind, cdmCommentContents, cdmCon, cdmDisconnect, cdmFlatten, cdmFromCohortSet, cdmFromCon, cdmFromTables, cdmName, cdmReference, cdmSample, cdmSelect, cdmSubset, cdmSubsetCohort, cdmTrimVocabulary, cdmVersion, cdmWriteSchema, cohdSimilarConcepts, cohortCodelist, cohortCount, compute, computeDataHashByTable, computeQuery, copyCdmTo, dateadd, datediff, datepart, dbms, dbSource, downloadEunomiaData, dropSourceTable, dropTable, eunomiaDir, eunomiaIsAvailable, exampleDatasets, generateCohortSet, generateCohortSet2, generateConceptCohortSet, inSchema, insertCdmTo, insertTable, insertTableSpark, listSourceTables, listTables, newCohortTable, readCohortSet, readSourceTable, recordCohortAttrition, requireEunomia, settings, snapshot, sourceType, summariseQuantile, summariseQuantile2, tblGroup, uniqueTableName, version

Dependencies: askpass, backports, bit, bit64, blob, checkmate, cli, clipr, cpp11, crayon, curl, DBI, dbplyr, dplyr, generics, glue, hms, httr2, jsonlite, lifecycle, magrittr, omopgenerics, openssl, pillar, pkgconfig, prettyunits, progress, purrr, R6, rappdirs, readr, rlang, snakecase, stringi, stringr, sys, tibble, tidyr, tidyselect, tzdb, utf8, vctrs, vroom, withr

DAG-Based Batch Optimization for Cohort SQL Generation
Overview | The One-at-a-Time Approach | The Optimized Batch DAG | Worked Example: Three Drug Exposure Cohorts | One-at-a-Time DAG | Optimized Batch DAG | How Equivalence Is Guaranteed | 1. Same Builder, Different Assembly | 2. Domain Filtering Includes Source Concepts | 3. QualifiedLimit Preservation | 4. Automated Equivalence Validation | Safety Mechanisms | Performance Characteristics | By Workload Type | Independent Cohorts (Different Domains) | Independent Cohorts (Overlapping Domains) | Subset Cohorts (Same Primary Criteria, Different Inclusion Rules) | Large Phenotype Libraries (100+ Cohorts) | Summary Table | Code Example | Single Cohort (No Optimization) | Batch (Optimized) | With CDMConnector | Validating Equivalence | Debugging | Appendix: How the DAG Is Enforced

Last update: 2026-03-09
Started: 2026-03-08

Incremental DAG Caching for Cohort Generation
Overview | How It Works | The Execution DAG | Merkle-Tree Hashing | The Cache Registry | Stable Table Naming | Cache-Aware Execution | Usage | Basic Usage | SQL-Only Usage | What Gets Cached vs. Not | Cached (persistent across runs) | Not Cached (rebuilt every run) | Why Concept Sets Aren't Individually Cached | Cache Management | Inspecting the Cache | Garbage Collection | Clearing the Cache | Correctness Guarantees | When to Use Caching | Example: Incremental Update

Last update: 2026-03-08
Started: 2026-03-08

Multi-Database Benchmarking: Old vs New Cohort Generation
Overview | Performance improvements with the new approach | How to run the benchmark | Prerequisites | Single database | Multiple databases | Benchmark results CSV (timing) | Equivalence CSV (same results) | Summary

Last update: 2026-03-08
Started: 2026-03-08

DBI connection examples
Postgres | Redshift | SQL Server | Snowflake | Databricks/Spark | Duckdb

Last update: 2026-02-12
Started: 2023-03-06

Getting Started
Creating a reference to the OMOP CDM | Joining tables | Saving query results to the database | Selecting a subset of CDM tables | Subsetting a CDM | Flatten a CDM | Closing connections | Summary

Last update: 2025-07-10
Started: 2023-03-06

Working with cohorts
Cohort Generation | Atlas cohort definitions | Subset a cohort | Custom Cohort Creation

Last update: 2025-07-10
Started: 2024-01-22

CDMConnector and dbplyr
Set up | Creating the cdm reference | Putting it all together | Behind the scenes

Last update: 2025-02-10
Started: 2023-03-06

Using CDM attributes
Set up | CDM reference attributes | CDM name | CDM version | Database connection | Cohort attributes | Generated cohort set | Creating a bespoke cohort

Last update: 2025-02-10
Started: 2023-08-01

Readme and manuals

Help Manual

Help page | Topics
Run a dplyr query and add the result set to an existing | appendPermanent
as.Date dbplyr translation wrapper | asDate
Run benchmark of tasks using CDMConnector | benchmarkCDMConnector
Insert Patient CDM Contents as Aligned Comments in RStudio | cdmCommentContents
Get underlying database connection | cdmCon
Disconnect the connection of the cdm object | cdmDisconnect.db_cdm
Flatten a cdm into a single observation table | cdmFlatten
Build a Synthetic CDM from a Cohort Set | cdmFromCohortSet
Create a CDM reference object from a database connection | cdmFromCon
Subset a cdm object to a random sample of individuals | cdmSample
Subset a cdm object to a set of persons | cdmSubset
Subset a cdm to the individuals in one or more cohorts | cdmSubsetCohort
Trim vocabulary tables to the minimum needed for the CDM | cdmTrimVocabulary
Get cdm write schema | cdmWriteSchema
Get similar concepts from Columbia Open Health Data (COHD) API | cohdSimilarConcepts
Compute a hash for each CDM table | computeDataHashByTable
Execute dplyr query and save result in remote database | computeQuery
Copy a cdm object from one database to another | copyCdmTo
Add days or years to a date in a dplyr query | dateadd
Compute the difference between two days | datediff
Extract the day, month or year of a date in a dplyr pipeline | datepart
Get the database management system (dbms) from a cdm_reference or DBI connection | dbms
Create a source for a cdm in a database. | dbSource
Download Eunomia data files | downloadEunomiaData
Drop table from a database backed cdm object | dropTable.db_cdm
Create a copy of an example OMOP CDM dataset | eunomiaDir
Has the Eunomia dataset been cached? | eunomiaIsAvailable
List the available example CDM datasets | exampleDatasets
Generate a cohort set on a cdm object | generateCohortSet
Generate a cohort set on a CDM object (optimized, no Java dependency) | generateCohortSet2
Create a new generated cohort set from a list of concept sets | generateConceptCohortSet
Helper for working with compound schema | inSchema
Fast bulk insert of a local table on Spark / Databricks | insertTableSpark
List tables in a schema | listTables
Read a set of cohort definitions into R | readCohortSet
Require eunomia to be available. The function makes sure that you can later create a eunomia database with 'eunomiaDir()'. | requireEunomia
Extract CDM metadata | snapshot
Quantile calculation using dbplyr | summariseQuantile
Quantile calculation using dbplyr | summariseQuantile2
CDM table selection helper | tblGroup
Get the CDM version | version