[SPARK-36438][PYTHON] Support list-like Python objects for Series comparison #34114

Closed
wants to merge 10 commits into from

itholic
Contributor

@itholic itholic commented Sep 27, 2021

What changes were proposed in this pull request?

This PR proposes to implement Series comparison with list-like Python objects.

Currently, Series doesn't support comparison with list-like Python objects such as list, tuple, dict, and set.

Before

>>> psser
0    1
1    2
2    3
dtype: int64

>>> psser == [3, 2, 1]
Traceback (most recent call last):
...
TypeError: The operation can not be applied to list.
...

After

>>> psser
0    1
1    2
2    3
dtype: int64

>>> psser == [3, 2, 1]
0    False
1     True
2    False
dtype: bool
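
The element-wise semantics in the "After" output can be illustrated without Spark. The following is only a plain-Python sketch of the comparison rule, not the code added by this PR: `compare_elementwise` is a hypothetical helper, and rejecting mismatched lengths with a `ValueError` is an assumption modeled on how pandas treats a list-like of the wrong length.

```python
from typing import Any, List, Sequence


def compare_elementwise(values: Sequence[Any], other: Sequence[Any]) -> List[bool]:
    """Hypothetical helper: position-wise equality of two equal-length sequences."""
    if len(values) != len(other):
        # Assumption: mismatched lengths are rejected instead of being broadcast.
        raise ValueError("Lengths must match to compare")
    # Compare the i-th element of one sequence with the i-th element of the other.
    return [left == right for left, right in zip(values, other)]


series_values = [1, 2, 3]  # stands in for the values held by psser
print(compare_elementwise(series_values, [3, 2, 1]))  # [False, True, False]
print(compare_elementwise(series_values, (1, 2, 3)))  # [True, True, True]
```

The first call mirrors `psser == [3, 2, 1]` above: only the middle position holds equal values, so only that position is `True`.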

This was originally proposed in databricks/koalas#2022, and all review comments on the original PR have been resolved.

Why are the changes needed?

To follow pandas' behavior.

Does this PR introduce any user-facing change?

Yes, Series comparison with list-like Python objects is now possible.

How was this patch tested?

Unit tests.

@itholic
Contributor Author

itholic commented Sep 27, 2021

cc @ueshin @HyukjinKwon @xinrong-databricks

@SparkQA

SparkQA commented Sep 27, 2021

Test build #143644 has finished for PR 34114 at commit d930f89.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 27, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48157/

@SparkQA

SparkQA commented Sep 27, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48157/

@SparkQA

SparkQA commented Sep 29, 2021

Test build #143700 has finished for PR 34114 at commit cbe62da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48215/

@SparkQA

SparkQA commented Sep 29, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48215/

@SparkQA

SparkQA commented Sep 30, 2021

Test build #143760 has finished for PR 34114 at commit fb555b4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 30, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48271/

@SparkQA

SparkQA commented Sep 30, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48271/

@SparkQA

SparkQA commented Oct 1, 2021

Test build #143780 has finished for PR 34114 at commit 5199684.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48292/

@SparkQA

SparkQA commented Oct 1, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48292/

@SparkQA

SparkQA commented Oct 1, 2021

Test build #143784 has finished for PR 34114 at commit 5c9b168.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48296/

@SparkQA

SparkQA commented Oct 1, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48296/

@SparkQA

SparkQA commented Oct 5, 2021

Test build #143836 has finished for PR 34114 at commit 57b3c73.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48349/

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48349/

@SparkQA

SparkQA commented Oct 5, 2021

Test build #143851 has finished for PR 34114 at commit 15a6c20.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48364/

@SparkQA

SparkQA commented Oct 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48364/

Member

@HyukjinKwon HyukjinKwon left a comment

The change looks fine. cc @ueshin

@SparkQA

SparkQA commented Oct 6, 2021

Test build #143867 has finished for PR 34114 at commit 92753a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48380/

@SparkQA

SparkQA commented Oct 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48380/

@HyukjinKwon
Member

Merged to master.

@SparkQA

SparkQA commented Oct 13, 2021

Test build #144189 has finished for PR 34114 at commit cf9cc33.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • public class NettyLogger
  • class IndexNameTypeHolder(object):
  • new_class = type(NameTypeHolder.short_name, (NameTypeHolder,),
  • new_class = param.type if isinstance(param, np.dtype) else param
  • class Database(NamedTuple):
  • class Table(NamedTuple):
  • class Column(NamedTuple):
  • class Function(NamedTuple):
  • protected class YarnSchedulerEndpoint(override val rpcEnv: RpcEnv)
  • public final class TableIndex
  • public final class AlwaysFalse extends Filter
  • public final class AlwaysTrue extends Filter
  • public final class And extends BinaryFilter
  • abstract class BinaryComparison extends Filter
  • abstract class BinaryFilter extends Filter
  • public final class EqualNullSafe extends BinaryComparison
  • public final class EqualTo extends BinaryComparison
  • public abstract class Filter implements Expression, Serializable
  • public final class GreaterThan extends BinaryComparison
  • public final class GreaterThanOrEqual extends BinaryComparison
  • public final class In extends Filter
  • public final class IsNotNull extends Filter
  • public final class IsNull extends Filter
  • public final class LessThan extends BinaryComparison
  • public final class LessThanOrEqual extends BinaryComparison
  • public final class Not extends Filter
  • public final class Or extends BinaryFilter
  • public final class StringContains extends StringPredicate
  • public final class StringEndsWith extends StringPredicate
  • abstract class StringPredicate extends Filter
  • public final class StringStartsWith extends StringPredicate
  • public class ColumnarBatch implements AutoCloseable
  • public final class ColumnarBatchRow extends InternalRow
  • class IndexAlreadyExistsException(message: String, cause: Option[Throwable] = None)
  • class NoSuchIndexException(message: String, cause: Option[Throwable] = None)
  • case class Sec(child: Expression)
  • case class Csc(child: Expression)
  • trait OperationHelper extends AliasHelper with PredicateHelper
  • case class AsOfJoin(
  • case class SetCatalogAndNamespace(child: LogicalPlan) extends UnaryCommand
  • case class CreateTempFunction(
  • case class CreateFunction(
  • class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
  • case class OptimizeSkewedJoin(
  • case class SkewJoinChildWrapper(plan: SparkPlan) extends LeafExecNode
  • case class SimpleCostEvaluator(forceOptimizeSkewedJoin: Boolean) extends CostEvaluator
  • case class SetCatalogCommand(catalogName: String) extends LeafRunnableCommand
  • case class SetNamespaceCommand(namespace: Seq[String]) extends LeafRunnableCommand
  • case class ShowCurrentNamespaceCommand() extends LeafRunnableCommand
  • case class WriterBucketSpec(
  • case class EnsureRequirements(

@SparkQA

SparkQA commented Oct 13, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48668/

@SparkQA

SparkQA commented Oct 13, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48668/

@HyukjinKwon
Member

Merged to master.
