e.g.: point.extensions.get("hr") #119

Open
jedie opened this issue Jun 6, 2018 · 11 comments

Comments

@jedie
Contributor

jedie commented Jun 6, 2018

It would be nice to have an easier way to get values from GPX extensions.

e.g.:

      <trkpt lat="51.43788929097354412078857421875" lon="6.617012657225131988525390625">
        <ele>23.6000003814697265625</ele>
        <time>2018-02-21T14:30:50.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:hr>125</ns3:hr>
            <ns3:cad>75</ns3:cad>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
>>> point.extensions.get("hr")
'125'
>>> point.extensions.get("cad")
'75'

I don't know whether it is possible to get integers here. Is the information about the extension types available somewhere?
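Extension values arrive as untyped XML text, so any integer conversion has to happen on the caller's side. The following is a minimal sketch using only the standard library, not gpxpy's API. The namespace URI bound to `ns3` is an assumption (the snippet above only shows the prefix), and `local_name`, `to_number` and `extension_values` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; the original snippet only shows the "ns3" prefix.
NS3 = "http://www.garmin.com/xmlschemas/TrackPointExtension/v1"

SNIPPET = f"""
<extensions xmlns:ns3="{NS3}">
  <ns3:TrackPointExtension>
    <ns3:hr>125</ns3:hr>
    <ns3:cad>75</ns3:cad>
  </ns3:TrackPointExtension>
</extensions>
"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def to_number(text):
    """Return int or float when the text looks numeric, otherwise the text unchanged."""
    if text is None:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def extension_values(extensions_element):
    """Map the local tag name of every leaf element to its converted value."""
    values = {}
    for element in extensions_element.iter():
        if len(element) == 0:  # leaf element: it has no child elements
            values[local_name(element.tag)] = to_number(element.text)
    return values


values = extension_values(ET.fromstring(SNIPPET))
print(values)  # {'hr': 125, 'cad': 75}
```

Guessing the type from the text is a heuristic. The authoritative types would come from the extension's schema, which a generic XML parser does not consult.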

@jedie
Contributor Author

jedie commented Jun 12, 2018

I have now made this:

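import collections


def get_extension_data(gpxpy_instance):
    """
    Return a dict that maps each extension tag name to the list of its
    values, collected from all track points.
    """
    extension_data = collections.defaultdict(list)

    for track in gpxpy_instance.tracks:
        for segment in track.segments:
            for point in segment.points:
                # Skip points without extensions instead of returning None,
                # which would discard everything collected so far.
                if not point.extensions:
                    continue

                # Iterate the element directly: getchildren() is deprecated
                # and was removed from xml.etree in Python 3.9.
                for child in point.extensions[0]:
                    # Strip the "{namespace}" prefix from the qualified tag name.
                    tag = child.tag.rsplit("}", 1)[-1]  # FIXME

                    value = child.text
                    try:
                        value = float(value) if "." in value else int(value)
                    except (TypeError, ValueError):
                        # Keep the raw text (or None) if it is not numeric.
                        pass
                    extension_data[tag].append(value)

    return extension_data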

Any idea how to improve this? Is there an easier way to get the "name" of the extensions?

@tkrajina
Owner

tkrajina commented Jun 13, 2018

Keep in mind that an <extensions> element can theoretically contain multiple extensions, and each can be any kind of XML subtree, for example:

    <extensions>
      <ns3:Ext attr="bbb">
        <ns3:hr>125</ns3:hr>
        <ns3:hr>125</ns3:hr>
        <ns3:hr>125</ns3:hr>
        <ns3:hr>125</ns3:hr>
        <ns3:hr>125</ns3:hr>
        <ns3:cad><ns3:bbb>75</ns3:bbb></ns3:cad>
      </ns3:Ext>
    </extensions>

And now you need a simple and easy way to get the attr attribute, the hr values, and cad->bbb value.
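For comparison, this is roughly what those three lookups look like with plain `xml.etree.ElementTree` today. It is only a sketch: the namespace URI is a placeholder, since the example above does not declare one for `ns3`.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, needed only so that the "ns3" prefix resolves.
NS = {"ns3": "http://example.com/ns3"}

SNIPPET = """
<extensions xmlns:ns3="http://example.com/ns3">
  <ns3:Ext attr="bbb">
    <ns3:hr>125</ns3:hr>
    <ns3:hr>125</ns3:hr>
    <ns3:hr>125</ns3:hr>
    <ns3:hr>125</ns3:hr>
    <ns3:hr>125</ns3:hr>
    <ns3:cad><ns3:bbb>75</ns3:bbb></ns3:cad>
  </ns3:Ext>
</extensions>
"""

extensions = ET.fromstring(SNIPPET)
ext = extensions.find("ns3:Ext", NS)

# 1. The unqualified "attr" attribute of the extension element.
attr = ext.get("attr")

# 2. Every repeated <ns3:hr> value, in document order.
hr_values = [int(element.text) for element in ext.findall("ns3:hr", NS)]

# 3. The nested cad -> bbb value, addressed with a path expression.
cad_bbb = int(ext.findtext("ns3:cad/ns3:bbb", namespaces=NS))

print(attr, hr_values, cad_bbb)  # bbb [125, 125, 125, 125, 125] 75
```

Each case needs a different call (`get`, `findall`, `findtext`) plus a namespace map, which is the friction a simpler extension API would have to hide.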

@jedie
Contributor Author

jedie commented Jun 25, 2018

Any idea how to design a simple-to-use API?

@tkrajina
Owner

tkrajina commented Jul 1, 2018

Well, no, not yet :) But, now that you asked, here are a couple of ideas:

Maybe something like:

points.extensions.get("TrackPointExtension", "hr") # returns a string
points.extensions.get_float("TrackPointExtension", "hr") # returns a number

Or, let's suppose there can be multiple hr tags:

points.extensions.get("TrackPointExtension", "hr[2]")

...and hr would just be an alias for hr[0].

Or maybe:

points.extensions.get("TrackPointExtension", "hr").string()
points.extensions.get("TrackPointExtension", "hr").number()
# in case of multiple "hr" elements, get the fourth one:
points.extensions.get("TrackPointExtension", "hr", 3).number()
# set a value:
points.extensions.get("TrackPointExtension", "hr").set(100)
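To make the second idea concrete, here is a rough sketch of how such a wrapper could behave. None of this exists in gpxpy: `Extensions` and `ExtensionValue` are hypothetical classes, tags are matched by local name only, and the namespace URI in the sample is a placeholder.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


class ExtensionValue:
    """Hypothetical wrapper around a single extension element."""

    def __init__(self, element):
        self._element = element

    def string(self):
        return self._element.text

    def number(self):
        return float(self._element.text)

    def set(self, value):
        self._element.text = str(value)


class Extensions:
    """Hypothetical lookup helper over a list of extension elements."""

    def __init__(self, elements):
        self._elements = list(elements)

    def get(self, extension_name, tag, index=0):
        """Return the index-th <tag> child of the named extension."""
        for extension in self._elements:
            if local_name(extension.tag) != extension_name:
                continue
            matches = [c for c in extension if local_name(c.tag) == tag]
            if index < len(matches):
                return ExtensionValue(matches[index])
        raise KeyError(f"{extension_name}/{tag}[{index}]")


root = ET.fromstring(
    '<extensions xmlns:ns3="http://example.com/ns3">'
    "<ns3:TrackPointExtension>"
    "<ns3:hr>125</ns3:hr><ns3:hr>130</ns3:hr><ns3:cad>75</ns3:cad>"
    "</ns3:TrackPointExtension>"
    "</extensions>"
)
extensions = Extensions(root)  # the children of <extensions>

hr = extensions.get("TrackPointExtension", "hr")
print(hr.string())  # 125
print(hr.number())  # 125.0
print(extensions.get("TrackPointExtension", "hr", 1).number())  # 130.0

hr.set(100)
print(hr.string())  # 100
```

With `index=0` as the default, the two-argument call is simply the shortcut for the first matching element.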

@jedie
Contributor Author

jedie commented Jul 2, 2018

points.extensions.get("TrackPointExtension", "hr[2]")

This looks ugly ;)

points.extensions.get("TrackPointExtension", "hr").string()
points.extensions.get("TrackPointExtension", "hr").number()
# in case of multiple "hr" elements, get the fourth one:
points.extensions.get("TrackPointExtension", "hr", 3).number()
# set a value:
points.extensions.get("TrackPointExtension", "hr").set(100)

This looks ok... Maybe "number" -> "float" ?!?

Since there can be multiple entries, get("TrackPointExtension", "hr") is a "shortcut" for get("TrackPointExtension", "hr", 0), isn't it?

@tkrajina
Owner

tkrajina commented Jul 2, 2018

Yes, I agree (including the "ugly" remark ;) ). Also, the API should allow for a way to retrieve attributes. Something like this:

points.extensions.get("ExtensionName", "tagName", "#attribute")

Or maybe:

points.extensions.getFloat("ExtensionName", "tagName", "#attribute")
points.extensions.getString("ExtensionName", "tagName", "#attribute")
points.extensions.get("ExtensionName", "tagName", "#attribute") #returns the DOM element
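One possible reading of the "#attribute" convention, sketched as free functions over a list of extension elements. This is not gpxpy code: `get_string` and `get_float` are hypothetical, the path is passed as separate arguments, and the `unit` attribute in the sample is made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def get_string(extensions, extension_name, *path):
    """Follow tag names in `path`; a final '#name' step selects an attribute.

    Returns the element text (or attribute value), or None if nothing matches.
    """
    for extension in extensions:
        if local_name(extension.tag) != extension_name:
            continue
        node, attribute = extension, None
        for step in path:
            if step.startswith("#"):
                attribute = step[1:]
                break
            # Descend to the first child whose local name matches this step.
            node = next((c for c in node if local_name(c.tag) == step), None)
            if node is None:
                break
        if node is None:
            continue
        return node.get(attribute) if attribute else node.text
    return None


def get_float(extensions, extension_name, *path):
    """Like get_string, but converts the result to float when present."""
    text = get_string(extensions, extension_name, *path)
    return None if text is None else float(text)


# Sample data; the namespace URI and the "unit" attribute are placeholders.
extensions = list(ET.fromstring(
    '<extensions xmlns:ns3="http://example.com/ns3">'
    '<ns3:Ext attr="bbb"><ns3:hr unit="bpm">125</ns3:hr></ns3:Ext>'
    "</extensions>"
))

print(get_string(extensions, "Ext", "#attr"))        # bbb
print(get_string(extensions, "Ext", "hr", "#unit"))  # bpm
print(get_float(extensions, "Ext", "hr"))            # 125.0
```

Returning None for a missing path keeps lookups cheap to chain, at the cost of hiding typos in tag names. Raising an exception would be the stricter alternative.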

@jedie
Contributor Author

jedie commented Jul 5, 2018

points.extensions.get_float("ExtensionName", "tagName", "#attribute")
points.extensions.get_string("ExtensionName", "tagName", "#attribute")
points.extensions.get("ExtensionName", "tagName", "#attribute") #returns the DOM element

;)

@pwolfram

Just curious if there is a solution here. GPX files from Strava have hr, cadence, etc. data as extensions like this:

    <extensions>
     <gpxtpx:TrackPointExtension>          
      <gpxtpx:hr>80</gpxtpx:hr>
      <gpxtpx:cad>0</gpxtpx:cad>
     </gpxtpx:TrackPointExtension>
    </extensions>

However, as far as I can tell, I don't think this data is getting brought into the extensions attribute of points. Given my understanding of the scope here this may be a bug. Any recommendation or advice on how to get the extensions data in practice is greatly appreciated.
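One way to check whether the data is present in the file at all, independent of gpxpy, is to read it with the standard library. The sketch below does that for a minimal hand-written sample. It assumes the usual GPX 1.1 namespace and the Garmin TrackPointExtension v1 namespace for the `gpxtpx` prefix, so verify both against the `xmlns` declarations in the actual export. `read_point_extensions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; compare with the xmlns attributes of the real file.
NS = {
    "gpx": "http://www.topografix.com/GPX/1/1",
    "gpxtpx": "http://www.garmin.com/xmlschemas/TrackPointExtension/v1",
}

GPX_TEXT = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1">
  <trk><trkseg>
    <trkpt lat="51.0" lon="6.0">
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:hr>80</gpxtpx:hr>
          <gpxtpx:cad>0</gpxtpx:cad>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def read_point_extensions(gpx_text):
    """Return one dict per track point with lat, lon and any extension values."""
    root = ET.fromstring(gpx_text)
    rows = []
    for trkpt in root.iterfind(".//gpx:trkpt", NS):
        row = {"lat": float(trkpt.get("lat")), "lon": float(trkpt.get("lon"))}
        tpx = trkpt.find("gpx:extensions/gpxtpx:TrackPointExtension", NS)
        if tpx is not None:
            for child in tpx:
                # hr and cad are whole numbers in this sample.
                row[child.tag.rsplit("}", 1)[-1]] = int(child.text)
        rows.append(row)
    return rows


rows = read_point_extensions(GPX_TEXT)
print(rows)  # [{'lat': 51.0, 'lon': 6.0, 'hr': 80, 'cad': 0}]
```

If this finds the values but `point.extensions` stays empty, the problem is on the parsing side rather than in the file.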

@andyreagan

Here's a version that works with Strava files, @pwolfram. It's not the prettiest, but it works.

import gpxpy
import pandas as pd
from lxml import etree
from pathlib import Path


def sloppy_float(text):
    # Not defined in the original post; assumed here to be a lenient
    # conversion that falls back to the raw text when it is not numeric.
    try:
        return float(text)
    except (TypeError, ValueError):
        return text


def df_from_segment(segment) -> pd.DataFrame:
    seg_list = []

    for point in segment.points:
        base_data = {
            'timestamp': point.time,
            'latitude': point.latitude,
            'longitude': point.longitude,
            'elevation': point.elevation,
            'speed': point.speed
        }
        # The first extension element holds hr, cad, etc.;
        # points without extensions contribute no extra columns.
        extension_data = {
            etree.QName(child).localname: sloppy_float(child.text)
            for child in (point.extensions[0] if point.extensions else [])
        }
        base_data.update(extension_data)
        seg_list.append(base_data)
    return pd.DataFrame(seg_list)


def df_from_track(track) -> pd.DataFrame:
    return pd.concat([df_from_segment(segment) for segment in track.segments])


def df_from_gpx(gpx) -> pd.DataFrame:
    return pd.concat([df_from_track(track) for track in gpx.tracks])


gpxfile = gpxpy.parse(Path("stravafile.gpx").read_text())
gpxfile_df = df_from_gpx(gpxfile)

@astrowonk

astrowonk commented Jul 25, 2021

@andyreagan @pwolfram I would love to know whether your Strava files, including hr, convert properly with my gpxcsv converter (which, while it produces CSV, can also easily make a list of dicts for a dataframe). It works well on the hr and other extension data in the Apple Watch GPX exports I have tried, but I haven't used Strava. You'd just:

import pandas as pd
from gpxcsv import gpxtolist

df = pd.DataFrame(gpxtolist('myfile.gpx'))

@andyreagan

@astrowonk confirmed, this works perfectly!

5 participants