Skipping values when writing to DB #576

Closed
jonathan-dev opened this issue May 17, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@jonathan-dev

Specifications

  • Client Version: v1.36.1
  • InfluxDB Version: v2.3.0+SNAPSHOT.090f681737
  • Platform: Ubuntu 22.04.2 LTS

Code sample to reproduce problem

from influxdb_client import InfluxDBClient, Point
from datetime import datetime

import os
import time
import logging

port = 8086
server = "influx"
org = "xxx"

client = InfluxDBClient(url=f"http://{server}:{port}", token=os.environ.get('INFLUXDB_TOKEN'), org=org)
start_connecting = time.time() # in seconds
while client.health().status != 'pass':
    print("+++ waiting for influx +++")
    time.sleep(20)
    if time.time() - start_connecting > 240:
        print("+++ timeout waiting for influx exiting... +++")
        exit(1)
print("+++ influx available +++")

write_api = client.write_api(error_callback=lambda conf, data, exc: print("error_callback", conf, data, exc), success_callback=lambda conf, data: print("success_callback", conf, data))
# write_api = client.write_api()


logging.getLogger('influxdb_client.client.write_api').setLevel(logging.DEBUG)

for i in range(2000):
    time.sleep(1/1000)
    t = datetime.now()
    print(i)
    write_api.write("sfs", "sca", Point("PulseRate")
    .tag("deviceId", "j0112233-4455-6677-8899-aabbccddeeff")
    .field("PulseValue", float(i))
    .time(t))

time.sleep(1) # sleep to let background thread write data

Expected behavior

I am writing 2000 samples in a loop to the DB and expect all 2000 to be stored.

Actual behavior

When querying the DB using |> count(), I am always missing a few samples (I get something like 1996 instead of 2000). In my investigation I noticed that the samples go missing right after the first batch is sent. For example, the first batch ends with 876 and the second batch starts with 880 (I got the numbers from the success callback).

Additional info

So I am guessing this is a thread-related bug (because batching uses threads)?
It may also be related to issue #436, but I'm not sure.

@jonathan-dev jonathan-dev added the bug Something isn't working label May 17, 2023
@powersj
Contributor

powersj commented May 21, 2024

In InfluxDB, a metric value is considered unique if the combination of metric name, tag set, and time is unique. In your case the metric name and tag set are always the same. That leaves the timestamp as the only thing that can distinguish one point from another.
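
To illustrate that rule, here is a toy model in plain Python. It is only a sketch of the uniqueness key, not how InfluxDB is implemented: the `store` helper and the sample points are hypothetical, and the timestamps are hard-coded so the collision is deterministic.

```python
from datetime import datetime, timedelta


def store(points):
    """Toy model: keep one value per (measurement, tag set, timestamp) key."""
    series = {}
    for measurement, tags, ts, value in points:
        key = (measurement, tuple(sorted(tags.items())), ts)
        series[key] = value  # same key: the later write replaces the earlier one
    return series


base = datetime(2023, 5, 17, 12, 0, 0)
tags = {"deviceId": "j0112233-4455-6677-8899-aabbccddeeff"}
points = [
    ("PulseRate", tags, base, 0.0),
    ("PulseRate", tags, base, 1.0),  # same timestamp as the previous point
    ("PulseRate", tags, base + timedelta(microseconds=1), 2.0),
]

stored = store(points)
print(len(points), "written,", len(stored), "stored")  # 3 written, 2 stored
```

Three points are written but only two survive, which is the same shape as a count() that comes back slightly below the number of writes.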

Calling datetime.now() does not guarantee a unique value on every loop iteration. It creates a timestamp with microsecond resolution, not nanosecond. That means you could potentially have two points with the same time, which InfluxDB treats as duplicate data: the later point overwrites the earlier one. This would explain why some of your samples appear to be missing.
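
One possible way to avoid such collisions is to generate the timestamps yourself and force them to be strictly increasing. The sketch below uses only the standard library; `make_unique_clock` and `next_ns` are hypothetical names, and the helper is meant for a single writer thread. Passing the resulting integer to `Point.time()` assumes the client accepts epoch nanoseconds as an integer with its default write precision, which is worth checking against the client documentation.

```python
import time


def make_unique_clock():
    """Return a function that yields strictly increasing nanosecond timestamps."""
    last = 0

    def next_ns():
        nonlocal last
        now = time.time_ns()
        # If the clock did not advance (or stepped backwards), bump by 1 ns
        # so that no two calls ever return the same value.
        last = now if now > last else last + 1
        return last

    return next_ns


next_ns = make_unique_clock()
stamps = [next_ns() for _ in range(2000)]
print(len(set(stamps)))  # 2000

# Assumed usage with the client from the report (not executed here):
#   Point("PulseRate").field("PulseValue", float(i)).time(next_ns())
```

The trade-off is that a bumped timestamp can be a few nanoseconds ahead of the real clock, which is usually acceptable when the alternative is losing the point.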

I'm going to close this as this isn't an issue with the library.

@powersj powersj closed this as not planned May 21, 2024