
[BUG] Using filebeat software to get logs is not real time #734

Open
bbotte opened this issue Apr 10, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@bbotte

bbotte commented Apr 10, 2024

Using Filebeat to ingest logs is not real time; logs are sometimes delayed by more than 10 minutes.

I installed SigLens with Docker (docker install).

Logs are ingested using Filebeat.

Logs are generated continuously, but they rarely appear in the browser in near real time (within 2 minutes); most show up 5 or 10 minutes late.

@bbotte bbotte added the bug Something isn't working label Apr 10, 2024
@nkunal
Contributor

nkunal commented Apr 10, 2024

Can you share your filebeat config?

@bbotte
Author

bbotte commented Apr 11, 2024

This is my Filebeat config:

$ cat /data/soft/filebeat/filebeat.yml 
filebeat.inputs:
  - type: log
    id: backend-api
    enabled: true
    paths:
      - /data/logs/backend/api.log
    tags: ["pd_java_log","backend"]
    processors:
      - add_locale:
          format: abbreviation
          timezone: UTC
      - dissect:
          tokenizer: "%{datetime} [%{thread}] %{logLevel} %{logger} %{pid} - log:[contentType:%{contentType}][request:%{request}][method:%{method}][status:%{status}][body:%{body}][response-contentType:%{response-type}][response:%{response}][duration:%{duration}]"
          field: "message"
      - timestamp:
          field: dissect.datetime
          timezone: UTC
          layouts:
            - '2006-01-02 15:04:05.999'
          test:
            - '2019-06-22 16:33:51.123'
      - drop_event:
          when:
            not:
              has_fields: ['dissect.body']
      - drop_fields:
          fields: ["message","dissect.thread","dissect.logger","dissect.response-type","dissect.pid"]

output.elasticsearch:
  hosts: ['http://172.21.0.247:8081/elastic/']
  index: 'filebeat-ind-0'

setup.template.enabled: false
setup.ilm.enabled: false
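
For reference, the log input's harvester timing settings bound how quickly Filebeat picks up new lines; they are not set in the config above, so the defaults apply. A sketch of the relevant knobs (the values shown are the Filebeat defaults, listed only to make the timing explicit, not a recommendation for this setup):

```yaml
# Hypothetical additions to the log input above; these are real Filebeat
# log-input options, shown with their default values:
filebeat.inputs:
  - type: log
    paths:
      - /data/logs/backend/api.log
    scan_frequency: 10s   # how often Filebeat scans for new/changed files
    backoff: 1s           # initial wait after reaching EOF before re-reading
    max_backoff: 10s      # upper bound on the EOF re-read interval
```

With these defaults the harvester alone should add at most seconds of delay, so a 5-10 minute lag would point elsewhere (queueing, output, or the UI).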

Sample log line:
2024-04-11 10:15:10.729 [http-nio-8889-exec-27] INFO com.pxxedu.book.web.filter.LogFilter 177 - log:[contentType:application/json][request:/actuator/health][method:POST][status:200][body:{"progressTime":110,"progress":10,"resourceId":3620,"studentId":137150,"replyId":5649963,"reportId":1712801590418,"openId":"","appType":"1"}][response-contentType:application/json][response:{"msg":"请求成功","code":10000,"data":{"replayId":5649963},"success":true,"serverTime":1712801710830}][duration:15ms]

OS: CentOS 7.9

Give it a try and see how it goes.

@sunitakawane
Collaborator

sunitakawane commented Apr 11, 2024

I deleted the data folder under the Filebeat directory before running this; please try doing that.
I tried the sample log you provided with the following config file, and I was able to see the log line in the SigLens UI in real time.

filebeat.inputs:
  - type: log
    id: backend-api
    enabled: true
    paths:
      - /data/logs/backend/api.log
    tags: ["pd_java_log","backend"]
    
    processors:
      - add_locale:
          format: abbreviation
          timezone: UTC
      - dissect:
          tokenizer: "%{datetime} [%{thread}] %{logLevel} %{logger} %{pid} - log:[contentType:%{contentType}][request:%{request}][method:%{method}][status:%{status}][body:%{body}][response-contentType:%{response-type}][response:%{response}][duration:%{duration}]"
          field: "message"
      - timestamp:
          field: dissect.datetime
          timezone: UTC
          layouts:
            - '2006-01-02 15:04:05.999'
          test:
            - '2019-06-22 16:33:51.123'

      - drop_event: # Drop events missing dissect.body
          when:
            not:
              has_fields: ['dissect.body']
      - drop_fields:
          fields: ["message","dissect.thread","dissect.logger","dissect.response-type","dissect.pid"]

output.elasticsearch:
  hosts: ['http://172.21.0.247:8081/elastic/']
  index: 'filebeat-ind-0'

setup.template.enabled: false
setup.ilm.enabled: false
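
One more thing worth checking, as a hedged suggestion not covered in this thread: Filebeat buffers events in an in-memory queue before publishing, and the flush settings control how long events can sit there before being sent. A sketch (these are real Filebeat options, but the values are illustrative and untested against this setup):

```yaml
# Hypothetical queue tuning to favor latency over throughput:
queue.mem:
  events: 4096
  flush.min_events: 512   # publish once this many events are buffered...
  flush.timeout: 1s       # ...or after this long, whichever comes first

output.elasticsearch:
  bulk_max_size: 50       # smaller bulk batches can reduce end-to-end latency
```

If events trickle in slowly and the flush timeout is large, batching alone can account for a noticeable delay between a line being written and it appearing in the UI.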

@bbotte
Author

bbotte commented Apr 12, 2024

The configuration is unchanged. I deleted all files under /var/lib/filebeat/, but the delay is still the same. It needs a longer test, such as half an hour.

@bbotte
Author

bbotte commented Apr 16, 2024

> I deleted data folder under filebeat dir before running this. Please try doing that. I tried using the sample log you provided and the following config file and I was able to see the logline on siglens UI in real time.

Did you test it? It should be reproducible.

@sunitakawane
Collaborator

sunitakawane commented Apr 16, 2024

Is the log file /data/logs/backend/api.log being updated continuously? Why do you suggest I test this for half an hour? With the one line of log data you provided, I could see that line in the SigLens UI without any delay. I also tested with another sample log file (siglens.log) and could see the logs in the UI.

filebeat.inputs:
  - type: log
    id: backend-api
    enabled: true
    paths:
      - /Users/github/siglens/logs/siglens.log
    tags: ["pd_java_log","backend"]
    
    processors:
      - add_locale:
          format: abbreviation
          timezone: UTC
      - dissect:
          tokenizer: "%{logLevel}[%{datetime}]"
          field: "message"
      - timestamp:
          field: dissect.datetime
          timezone: UTC
          layouts:
            - '2006-01-02 15:04:05.999'
          test:
            - '2019-06-22 16:33:51.123'

      
output.elasticsearch:
  hosts: ['http://localhost:8081/elastic/']
  index: 'filebeat-ind-0'

setup.template.enabled: false
setup.ilm.enabled: false

@sekureco42

I also noticed that the GUI has a delay of approximately 2 minutes when displaying ingested data. I don't know exactly where this delay comes from; could it be some kind of caching in the front end? Sometimes the delay is shorter, but typically I see this 2-minute offset.

I'm using Vector.dev for log ingestion, via the Elasticsearch API endpoint.
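
In case the delay is on the ingestion side rather than in the UI, Vector's Elasticsearch sink has batch settings that bound how long events wait before being flushed. A hypothetical sketch (the batch options are real Vector settings; the source name and endpoint are placeholders for whatever your pipeline actually uses):

```yaml
# Hypothetical Vector sink config favoring low latency:
sinks:
  siglens:
    type: elasticsearch
    inputs: ["my_logs"]          # placeholder source name
    endpoints: ["http://localhost:8081/elastic/"]
    batch:
      max_events: 100
      timeout_secs: 1            # flush at least once per second
```

If the data reaches SigLens promptly (e.g. it is queryable via the API) but still takes ~2 minutes to show in the browser, that would point at the UI rather than the ingest path.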
