
zarr-python

Safe ⚙️ External commands🌐 Network access

Store large N-dimensional arrays efficiently

Also available from: davila7

Work with large datasets that exceed memory limits: zarr-python enables chunked, compressed array storage for efficient cloud-native scientific computing workflows.

Supports: Claude Code (CC), Codex
📊 Quality score: 69 (Adequate)
1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "zarr-python". Create a Zarr array for storing temperature data with 365 time steps, 720 latitudes, and 1440 longitudes.

Expected outcome:

  • Created Zarr array at 'temperature.zarr'
  • Shape: (365, 720, 1440) | Chunks: (1, 720, 1440) | Dtype: float32
  • Compression: Blosc (zstd, level 5) with shuffle filter
  • Each chunk contains one complete daily snapshot (~4MB)
  • Use z.append() to add new time steps efficiently

Security Audit

Safe
v4 • 1/17/2026

All 227 static findings are false positives: the analyzer misidentified markdown documentation content as security vulnerabilities. Backticks in markdown are code formatting, not shell execution. Compression codec names (zstd, gzip, lz4) were flagged as cryptographic algorithms, but they are data-compression codecs, not cryptography. The URLs are legitimate documentation links. No executable code, shell commands, or cryptographic operations exist in these documentation files.

  • Files scanned: 4
  • Lines analyzed: 2,641
  • Findings: 2
  • Total audits: 4

Risk Factors

⚙️ External commands (2)
🌐 Network access (1)
Audited by: claude

Quality Score

  • Architecture: 41
  • Maintainability: 100
  • Content: 81
  • Community: 30
  • Security: 100
  • Spec Compliance: 74

What You Can Build

Store climate model data

Store terabyte-scale climate data with time dimensions. Enable efficient appending of new timesteps.

Manage model checkpoints

Store large embedding matrices and model weights. Integrate with Dask for distributed training.

Process genomic datasets

Handle multi-terabyte genomic arrays. Use cloud storage for collaboration.

Try These Prompts

Basic Array Setup
Create a Zarr array with shape (10000, 10000), chunks of (1000, 1000), and float32 dtype. Store it at data/my_array.zarr.
Cloud Storage
Set up a Zarr array stored in S3 with s3fs. Use bucket my-bucket and path data/arrays.zarr.
Dask Integration
Load a Zarr array as a Dask array and compute the mean along axis 0 in parallel.
Performance Tuning
Create a Zarr array optimized for cloud storage: 10MB chunks, consolidated metadata, and sharding enabled.
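For the Performance Tuning prompt, chunk sizing is plain arithmetic: bytes per chunk is the element count times the dtype's item size. A small helper (hypothetical, for illustration) shows how to back into a chunk shape that hits a ~10 MB target:

```python
import numpy as np

def chunk_nbytes(chunk_shape, dtype):
    """Bytes occupied by one uncompressed chunk of the given shape and dtype."""
    return int(np.prod(chunk_shape)) * np.dtype(dtype).itemsize

# A (1, 720, 1440) float32 chunk is ~4 MB.
one_day = chunk_nbytes((1, 720, 1440), "float32")
print(one_day)  # 4147200 bytes

# To reach a ~10 MB target, stack more timesteps into each chunk.
target = 10 * 1024 * 1024
steps = max(1, target // one_day)
print(steps)  # 2 timesteps per chunk
```

Remember the target applies to the uncompressed chunk; on-disk sizes will be smaller once compression is applied.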

Best Practices

  • Choose chunk sizes of 1-10 MB for optimal I/O performance
  • Align chunk shape with your data access pattern (e.g., time-first for time series)
  • Consolidate metadata when using cloud storage to reduce latency

Avoid

  • Avoid loading entire large arrays into memory; process in chunks
  • Do not use small chunks (<1MB) as they create excessive metadata overhead
  • Avoid frequent writes to the same cloud storage location without synchronization

Frequently Asked Questions

What is the difference between Zarr v2 and v3 formats?
V3 supports sharding and has improved metadata. V2 is widely compatible with older tools. Zarr auto-detects format.
How do I choose the right chunk size?
Target 1-10 MB per chunk. For float32 data, 512x512 elements equals approximately 1 MB.
Can Zarr handle arrays larger than available memory?
Yes. Zarr only loads chunks needed for current operations. Use Dask for parallel out-of-core processing.
What compression should I use?
Use Blosc with lz4 for speed, zstd for balanced compression, or gzip for maximum compression ratio.
How does Zarr compare to HDF5?
Zarr offers simpler cloud integration, better metadata handling, and native support for parallel access patterns.
Can I use Zarr with existing HDF5 files?
Yes. Use h5py to read HDF5 files and zarr.array() to convert them to Zarr format.