
enable reading of pbfs with node location for ways #111

Open · wants to merge 4 commits into master
Conversation


@sergiupantiru sergiupantiru commented Jun 27, 2023

Fixes #110

@sergiupantiru sergiupantiru marked this pull request as ready for review June 27, 2023 07:43
Member

@angelcervera angelcervera left a comment


First of all, thanks for your updates, and apologies for the delay in doing the CR. I've been very busy these days.

Important: I know there are a lot of comments, but in general (except for a few things) it looks good to me. ;)

The PR is not passing one of the checkers. It looks like it is not compiling for Scala 2.13.
To reproduce it, you can run PATCH_211=false sbt +test or, from the sbt shell:

sbt:osm4scala-root> ++ 2.13.5!
sbt:osm4scala-root> clean
sbt:osm4scala-root> compile

Maybe you can fix the warnings as well.

[error] /home/angelcc/projects/tmp/osm4scala/core/src/main/scala/com/acervera/osm4scala/utilities/CoordUtils.scala:52:19: Double does not take parameters
[error]       .doubleValue();
[error]                   ^
[error] /home/angelcc/projects/tmp/osm4scala/core/src/main/scala/com/acervera/osm4scala/utilities/CoordUtils.scala:66:19: Double does not take parameters
[error]       .doubleValue();
[error]                   ^
[warn] /home/angelcc/projects/tmp/osm4scala/core/src/main/scala/com/acervera/osm4scala/model/model.scala:110:48: Widening conversion from Long to Double is deprecated because it loses precision. Write `.toDouble` instead.
[warn]         .drop(1).map(x=> convertToMicroDegrees(x)),
[warn]                                                ^
[warn] /home/angelcc/projects/tmp/osm4scala/core/src/main/scala/com/acervera/osm4scala/model/model.scala:113:49: Widening conversion from Long to Double is deprecated because it loses precision. Write `.toDouble` instead.
[warn]         .drop(1).map(x => convertToMicroDegrees(x))
[warn]                                                 ^
[warn] two warnings found
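
For context: in Scala 2.13, BigDecimal's doubleValue is declared without a parameter list, so calling it with trailing parentheses is parsed as applying the resulting Double, hence the error. A minimal sketch of both fixes (names here are illustrative, not the PR's actual code):

```scala
object Scala213Fixes {
  // Scala 2.13 declares doubleValue without a parameter list, so the old
  // `.doubleValue()` is parsed as Double.apply() and fails to compile.
  // Dropping the parens cross-compiles on 2.11, 2.12, and 2.13.
  def toDoubleCompat(v: BigDecimal): Double = v.doubleValue

  // The deprecation warning asks for the Long -> Double widening to be
  // written explicitly instead of relying on the implicit conversion.
  def widen(micros: Seq[Long]): Seq[Double] = micros.map(_.toDouble)
}
```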

Anyway, I think that after that it will still fail because of the dependency on Spark 3.1.1, but at least let's keep it compatible with Scala 2.13.
Don't worry about that; I will fix it before the release.

BTW, I'm thinking about dropping Scala 2.11 compatibility, because we are not using it and it's a pain for cross-compiling and dependencies, but I have no idea whether other people are still using it. Could you give me some feedback about this?

* @param currentValue
* @return
*/
def decompressCoord(offSet: Long, delta: Long, currentValue: Double, granularity: Int): Double = {
Member

Maybe these changes are a good idea to increase precision, but they are out of the scope of adding the new fields.
Could you move them out of this PR? Before adding them, we need to check whether that much precision is necessary for the precision level used in OSM, and also what performance penalty we would be adding.

BTW, I'm not sure, but I think the compiler is not going to optimize the code enough, so what are in fact constants (like BigDecimal.valueOf(1E-9)) are going to be created every time you execute the function (so billions of times in this case).

So what do you think about removing these changes from this PR and creating a new one to discuss them?
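
As an aside, a minimal sketch of the hoisting being described, assuming a 1e-9 nanodegree factor as in the OSM PBF spec (illustrative only; the real decompressCoord recurrence lives in CoordUtils.scala):

```scala
object CoordUtilsSketch {
  // Hoisted to the enclosing object: built exactly once, instead of on
  // every call (billions of calls when streaming a planet file).
  private val NanoDegree: BigDecimal = BigDecimal(1e-9)

  // Scaling a raw nanodegree value to degrees; the full recurrence
  // (offset, delta, granularity) is deliberately omitted here.
  def scale(raw: Long): Double = (NanoDegree * BigDecimal(raw)).doubleValue
}
```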

Author

I have removed them.

Member

@angelcervera angelcervera left a comment


Apart from the compilation error, here is another general comment: could you update the documentation with the new schema and so on? That would be great as well.
If you don't have time to do it, I can do it in the future, so don't worry too much.

@@ -122,7 +122,12 @@ class OsmPbfFormat extends FileFormat with DataSourceRegister {

override def inferSchema(sparkSession: SparkSession,
options: Map[String, String],
files: Seq[FileStatus]): Option[StructType] = Some(OsmSqlEntity.schema)
files: Seq[FileStatus]): Option[StructType] =
if (options.getOrElse("wayWithGeometry","false").toLowerCase().equals("true")) {
Member

I would keep only one schema.
After this change, I think I will increase the major version of the library, because of backward compatibility (and I will take this chance to drop Scala 2.11 compatibility).
I don't think that adding two nullable columns will affect things too much.
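
For illustration, a sketch of what a single schema with the two nullable geometry columns could look like in Spark (field names here are assumptions, not necessarily the PR's final ones):

```scala
import org.apache.spark.sql.types._

object WaySchemaSketch {
  // One schema for all readers: the geometry columns are nullable, so a
  // file read without the wayWithGeometry option simply yields nulls there.
  val schema: StructType = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("nodes", ArrayType(LongType, containsNull = false), nullable = true),
    StructField("tags", MapType(StringType, StringType), nullable = true),
    StructField("latitude", ArrayType(DoubleType, containsNull = false), nullable = true),
    StructField("longitude", ArrayType(DoubleType, containsNull = false), nullable = true)
  ))
}
```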

Author

I just wanted to keep it backward compatible. I can keep only one if you think that's better.

@@ -88,7 +89,9 @@ case class WayEntity(
id: Long,
nodes: Seq[Long],
tags: Map[String, String],
info: Option[Info] = None
info: Option[Info] = None,
lat: Seq[Double],
Member

What do you think about adding Seq.empty as a default value, to simplify migration to the new version? That will make other users happier than if they need to deal with compilation errors. ;)
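
A minimal sketch of the suggestion, with a stub Info type so it stands alone (only the new fields get defaults):

```scala
// Stub standing in for the library's Info type, just to keep the sketch
// self-contained.
final case class Info(version: Option[Int] = None)

// Defaulting lat/lon to Seq.empty means existing constructor calls such as
// WayEntity(id, nodes, tags) keep compiling after the upgrade.
case class WayEntity(
    id: Long,
    nodes: Seq[Long],
    tags: Map[String, String],
    info: Option[Info] = None,
    lat: Seq[Double] = Seq.empty, // new, defaulted
    lon: Seq[Double] = Seq.empty  // new, defaulted
)
```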

Author

added

Member

I would keep this as before, but add the two new nullable fields. See my opinion about the schema in the other comment.

case FIELD_TAGS => calculateTags(entity.tags)
case FIELD_INFO => entity.info.map(populateInfo).orNull
case fieldName => throw new Exception(s"Field $fieldName not valid for a Way.")
private def populateWay(entity: WayEntity, structType: StructType): Seq[Any] = structType.fields.map(f =>
Member

Sorry, I don't understand this change; maybe I'm missing something. Why do you prefer .fields over .fieldNames, when the only use of the field is to get the name? We are adding an extra step in the code. In the end it is exactly the same, but, in my opinion, removing the extra step is more readable.

Author

It's the same as for Relations: the type of the field is needed.
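
For context, a sketch of the difference, with hypothetical helper logic (not the PR's actual populateWay): .fieldNames exposes only the name, while .fields also carries the data type to branch on.

```scala
import org.apache.spark.sql.types.{ArrayType, DoubleType, StructType}

// Hypothetical populate step: .fields exposes each StructField, so the
// code can branch on f.dataType (e.g. for the new Seq[Double] columns),
// which plain .fieldNames cannot do.
def populate(values: Map[String, Any], structType: StructType): Seq[Any] =
  structType.fields.toSeq.map { f =>
    f.dataType match {
      case ArrayType(DoubleType, _) => values.getOrElse(f.name, Seq.empty[Double])
      case _                        => values.getOrElse(f.name, null)
    }
  }
```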

@angelcervera
Member

Hi @sergiupantiru Tests are still failing. Do you need help?

@sergiupantiru
Author

> Hi @sergiupantiru Tests are still failing. Do you need help?

Hey, if you have time, help is always welcome. I won't be near a computer for the next 3 weeks.

@angelcervera
Member

> > Hi @sergiupantiru Tests are still failing. Do you need help?
>
> Hey, if you have time, help is always welcome. I won't be near a computer for the next 3 weeks.

No rush, take your time.

@sonarcloud

sonarcloud bot commented Aug 10, 2023

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information
0.0% Duplication

@sergiupantiru
Author

Can you please take another look?

@sergiupantiru
Author

Hey, any updates on this?

@angelcervera
Member

angelcervera commented Feb 12, 2024

@sergiupantiru All looks fine. I remember that I checked it locally as well, but I want to be sure. Next week I will spend time on it and I will create a new release.
