Data and tools for generating and inspecting OLMo pre-training data.
Updated May 15, 2024 - Python
Numerical implementation in Python of the article "Inverses and n-uncial property of Jacobian elliptic functions" (Solanilla, Leal, Tique, 2021). Simulates data and creates maps of the Earth using n-uncial projections, derived from the generalization of the quincuncial property.
USC DSCI 560 - Data Science Professional Practicum - Spring 2024 - Prof. Young Cho
Python Stream Processing
Open-source data platform to build event-driven systems. It's like Debezium for golang :)
An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
A collection of actions for working with ROS data
📋 Acceptance testing of rules authored by the ACT Rules Community Group (@act-rules) and implemented by Alfa
♿ Suite of open and standards-based tools for performing reliable accessibility conformance testing at scale
Efficient batch processing for node.js and browsers
Open pixelated STEM framework
Desktop GUI for SNAP based on NetBeans Platform
A collection of Python tools
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
In this project, I wrote a Python program to explore The Movie Database (TMDB). The code imports the data, cleans it, and answers interesting questions about it by computing descriptive statistics and visualizing the results, using NumPy, Matplotlib, and pandas.
A light-weight, flexible, and expressive statistical data testing library
Kubernetes-native platform to run massively parallel data/streaming jobs