This repository has been archived by the owner on Sep 17, 2023. It is now read-only.

bmazzarol/TypedSpark.NET

TypedSpark.NET

Due to inactivity and the lack of support for Spark.NET, this repository has been archived. I recommend building Spark applications in a supported language rather than in .NET.



Typesafe bindings for ⭐ Spark.NET

IMPORTANT: This library is under active construction 👷 and should not be used in production. Help is always appreciated: create an issue, check out the code, and have a play!

Features

  • Check Spark programs at compile time
  • Zero dependencies (except Spark.NET!)
  • Easy to use: it's LINQ for Spark
  • Replace stringly typed code with strong models
  • No more untyped Spark APIs
// create a model using typed columns
public sealed class Person : TypedSchema<Person>
{
    public StringColumn Name { get; private set; }
    public IntegerColumn Age { get; private set; }
    
    public Person(string? alias): base(alias) {}
    public Person(): this(default) {}
}

// now it can be used in typed query operations using LINQ
DataFrame df; // an existing untyped DataFrame; assign it before use
TypedDataFrame<Person> personDf = df.AsTyped<Person>();

personDf
    .Where(x => x.Age > 18)
    .OrderBy(x => new { Age = x.Age.Desc() })
    .Select(x => new { PersonName = x.Name });

// more to come!!

Getting Started

Coming Soon.

Why?

Strong types facilitate better code. Spark is typed, and by leveraging the C# type system we can expose those types and enforce correct Spark applications before they are even run.

More details to come!
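
To make the idea concrete, here is a minimal, self-contained sketch of how typed column wrappers can move a class of errors from run time to compile time. Every type in it (Condition, IntColumn, TextColumn, PersonColumns, Demo) is a hypothetical stand-in invented for illustration; none of them is part of TypedSpark.NET or Spark.NET, and the real library may well work differently. It assumes C# 9 or later for records.

```csharp
using System;

// Hypothetical stand-ins for illustration only; these are NOT the TypedSpark.NET types.

// A filter expression rendered as text, standing in for a Spark column expression.
public sealed record Condition(string Sql);

// A column known at compile time to hold integers.
public sealed record IntColumn(string Name)
{
    // C# requires the > and < operators to be declared as a pair.
    public static Condition operator >(IntColumn column, int value) =>
        new Condition($"{column.Name} > {value}");

    public static Condition operator <(IntColumn column, int value) =>
        new Condition($"{column.Name} < {value}");
}

// A column known at compile time to hold strings; it defines no numeric comparisons.
public sealed record TextColumn(string Name);

// A schema whose members carry their column types.
public sealed record PersonColumns(TextColumn Name, IntColumn Age);

public static class Demo
{
    // Applies a predicate to the schema, so the lambda only sees typed columns.
    public static Condition Where(PersonColumns schema, Func<PersonColumns, Condition> predicate) =>
        predicate(schema);

    public static string AdultsFilter()
    {
        var person = new PersonColumns(new TextColumn("Name"), new IntColumn("Age"));

        // Compiles: Age is an IntColumn, so `> 18` resolves to the operator above.
        var adults = Where(person, x => x.Age > 18);

        // Does not compile (error CS0019): TextColumn has no `>` operator taking an int.
        // var invalid = Where(person, x => x.Name > 18);

        return adults.Sql;
    }
}
```

Calling Demo.AdultsFilter() returns the rendered filter text. Uncommenting the invalid line makes the build fail, which is the point: with string-based column lookups the same mistake would only surface once the query is analysed at run time.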

Attributions

Fire icons created by juicy_fish
