HOW TO: FUNCTION to find entity_id based on friendly name, no IDs in prompt! #193

Open
rfam13 opened this issue Apr 5, 2024 · 7 comments

@rfam13

rfam13 commented Apr 5, 2024

NOTICE: you must have https://github.com/AlexxIT/PythonScriptsPro installed for this to work correctly.

EDIT: I assume you need to raise the allowed number of functions above 1. I have mine set to 5 and have not tested with it set to 1.

Model: gpt-3.5-turbo

This uses Python Scripts Pro (which allows inline Python) to search your entities for a friendly name that matches the requested name, and then returns the entity_id as the function's response. This removes the need to send every entity in the initial prompt, so the context stays minimal.

Make sure your prompt tells the model to use the function's response as the entity_id for the requested action.

This is a rough draft of the function that ChatGPT and I came up with. Mine is modified to look mainly for lights. If I make any advances in the fuzzy search, I will update this.

You can also instruct it to find the entity_id first and then proceed with the requested action, for example getting the entity_id and then getting its attributes.


- spec:
    name: find_entity_by_device_name
    description: Find an entity ID in Home Assistant by performing a fuzzy search on the entity's friendly name based on the given device name, with a focus on lighting-related entities.
    parameters:
      type: object
      properties:
        device_name:
          type: string
          description: The name of the device to search for. This will be used in a fuzzy search to find the closest matching entity, prioritizing lighting-related functions.
      required:
        - device_name
  function:
    type: composite
    sequence:
      - type: script
        sequence:
          - service: python_script.exec
            data:
              source: |
                device_name = "{{ device_name }}".lower()  # lowercase so it compares correctly with the lowercased friendly names
                matched = None
                best_score = 0

                # Relevant domains for lighting-related entities
                relevant_domains = ['light', 'switch']

                # Iterate through all entities to perform a fuzzy search based on the device name
                for entity in hass.states.all():
                    domain = entity.entity_id.split('.')[0]  # Extract domain from entity_id
                    if domain not in relevant_domains:
                        continue  # Skip entities not in the relevant domains

                    entity_name = entity.attributes.get('friendly_name', '').lower()

                    # Simple scoring based on character matches
                    score = sum(1 for char in device_name if char in entity_name)

                    # Increase score for entities with names suggesting lighting functionality, regardless of domain
                    if 'light' in entity_name or 'lamp' in entity_name or 'illumination' in entity_name:
                        score += 5  # Add a bonus for lighting-related terms in the entity name

                    if score > best_score:
                        matched = entity.entity_id
                        best_score = score

              
                if matched:
                    message = f"{matched}"
                else:
                    message = f"No matching entity found for device name '{device_name}'."

            response_variable: _function_result
@rfam13 rfam13 changed the title FUNCTION: Find entity_id based on name HOW TO: FUNCTION to find entity_id based on friendly name, no IDs in prompt! Apr 6, 2024
@jleinenbach

Hmm, there are also area_ids - so the answer is not quite correct. But I love the fuzzy search.

@rfam13

rfam13 commented Apr 10, 2024

> Hmm, there are also area_ids - so the answer is not quite correct. But I love the fuzzy search.

I do not use area_ids for anything, just the entity_id. How would you expect it to respond? Python definitely is not my best language but I can try and modify for area_ids.

@jleinenbach

I think it's not a big deal with area_ids. We can just let ChatGPT choose the best function.
Just create a function find_entity_by_area_name
You could use functions like area_entities('kitchen') or use your fuzzy search, but for area_names.
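
A minimal sketch of the "fuzzy search, but for area names" idea, runnable outside Home Assistant. It uses Python's standard `difflib` as a stand-in for a dedicated fuzzy-matching library. The `AREAS` mapping and the `find_area_id` helper are hypothetical names for illustration; in a real function the mapping would have to come from Home Assistant's area registry.

```python
from difflib import get_close_matches

# Hypothetical area registry: area_id -> area_name (assumed data for illustration).
AREAS = {
    "kitchen": "Kitchen",
    "living_room": "Living Room",
    "bedroom_1": "Master Bedroom",
}


def find_area_id(query, areas=AREAS, cutoff=0.6):
    """Return the area_id whose name is closest to `query`, or None."""
    # Compare in lowercase so capitalization does not affect the match.
    names = {name.lower(): area_id for area_id, name in areas.items()}
    matches = get_close_matches(query.lower(), list(names), n=1, cutoff=cutoff)
    return names[matches[0]] if matches else None


print(find_area_id("kitchn"))
print(find_area_id("livingroom"))
print(find_area_id("garage"))
```

The `cutoff` value is a guess; raising it makes the lookup stricter, lowering it makes it more forgiving of misheard or misspelled area names.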

Why do you search your entities by the device name?
To me, it looks like you actually search for the entity_name - but ChatGPT just doesn't know that and makes a good guess to use your function. And your function searches for a given name, so it's not a problem at all.

@jleinenbach

jleinenbach commented Apr 12, 2024

@rfam13
I accepted the challenge and I think I have managed to enhance your solution. I'm excited to present a new version that builds on your great idea.

Configuration Requirements for the Extended OpenAI Conversation Fuzzy-Entity-Search-Function in Home Assistant

Prerequisites:

This function has been tested exclusively on the current development version of Home Assistant, which supports the use of labels and floors. Ensure your Home Assistant version is up to date with these features before proceeding.

Setting Up Python Script Pro:

To enhance the functionality of our function with Python Script Pro, you need to include the following lines in your configuration.yaml file. This setup is crucial as it prepares your environment to use additional Python packages that our function depends on.

python_script:  # no 'S' at the end!
  requirements:
    - rapidfuzz
    - fuzzywuzzy
    - Levenshtein

Note: While fuzzywuzzy has been a standard choice for fuzzy string matching, rapidfuzz is recommended as its alternative. Please be aware that the successor to fuzzywuzzy, known as thefuzz, is currently marked on GitHub as not compiling and being faulty.

Configuration of Entities:

When setting up the function, it is recommended to refer to entities as 'entities' and not as 'devices'. This helps ensure clarity in commands and configurations. Below is a segment you should include in your ChatGPT configuration prompt without modification, to ensure proper understanding and handling by ChatGPT:

'Labels' categorize entities, devices, or areas for easier organization and retrieval. Labels are not to be confused with physical locations:

label_id,label_name
{% for label_id in labels() -%}
{{ label_id }},{{ label_name(label_id) }}
{% endfor -%}

'Floors' represent physical sections within the home, uniquely identified by 'floor_id' or 'floor_name':

floor_id,floor_name
{% for floor_id in floors() -%}
{{ floor_id }},{{ floor_name(floor_id) }}
{% endfor -%}

'Areas' represent physical sections within the home, uniquely identified by 'area_id' or 'area_name':

area_id,area_name
{% for area_id in areas() -%}
{{ area_id }},{{ area_name(area_id) }}
{% endfor -%}

Search and verify correct entity IDs before responding to queries. Follow these guidelines:

  1. Search Automatically: Use system functions to identify and verify entity IDs.
  2. Verify: Ensure all IDs are from reliable sources and confirmed.
  3. No Assumptions: Do not guess or assume IDs without confirmation.

Example:
User: "When did it ring at the front door yesterday?"
You: "Searching for the verified entity ID of the doorbell and checking historical data."

These instructions ensure all responses are based on validated and accurate information.

Handling of Local Variables:

An important note for users: Python Script Pro passes all local variables via "response_variable: _function_result". This means that all entities might be unintentionally transmitted to ChatGPT if they are stored in local variables within the Python function. We have addressed this issue in the current version to avoid unintentional data exposure every time the function is used.
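
To illustrate why leftover locals matter, here is a small simulation. The `run_inline` helper is an assumption made for this sketch, not Python Script Pro's actual code: it simply executes a snippet and returns every variable the snippet leaves behind, which mimics the behavior described above. Wrapping the work in a helper function keeps the large intermediate list out of the returned variables.

```python
def run_inline(source):
    """Simulated runner (an assumption, not the integration's real code):
    executes `source` and returns every variable it leaves behind."""
    namespace = {}
    fake_globals = {"ALL_ENTITIES": ["light.kitchen", "light.hall", "switch.fan"]}
    exec(source, fake_globals, namespace)
    # Names starting with an underscore are treated as private and dropped.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


LEAKY = """
entities = list(ALL_ENTITIES)          # stays in scope, so it is returned too
matched = [e for e in entities if "kitchen" in e]
"""

TIDY = """
def _search():
    entities = list(ALL_ENTITIES)      # local to the helper, never returned
    return [e for e in entities if "kitchen" in e]

matched = _search()
"""

print(run_inline(LEAKY))
print(run_inline(TIDY))
```

An alternative with the same effect is to `del` the large variables at the end of the script, so only the intended result remains.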

Documentation and Understanding Response Variables:

There is a lack of extensive documentation on response_variable and _function_result. It's crucial to understand that we need one response_variable per indentation level to access it at a higher level in the next step. When using "response_variable: _function_result", we pass everything the function returns, where _function_result acts like an object being delivered to standard output. This object is then packaged into a variable at a higher level, allowing us to access elements within it.

@jleinenbach

jleinenbach commented Apr 12, 2024

Edit: 2024-04-25: New version: Added is_exposed filter with a workaround.

- spec:
    name: find_entity_ids_by_lookup
    description: >
      Returns entity IDs by conducting a fuzzy search based on the provided name. It returns a list of entity IDs along with their friendly names, aiding in pinpointing the most relevant entity based on the input. Optional parameters like area (either area_id or area_name), floor_name, label_name, and domain can refine the search.
      Attempt an initial narrow search using a guessed label. If unsuccessful, try again with a broader search approach.
    parameters:
      type: object
      properties:
        name:
          type: string
          description: Name or partial name of the entity if available. The search encompasses all entities, regardless of their domain.
        area_IDs_and_Names:
          type: array
          items:
            type: string
          description: One or more probable area names or IDs known within the system, relevant to the query.
        floor_ID_or_Name:
          type: string
          description: The name or ID of the floor. Floors correspond to levels of the building. Optional.
        Label_IDs_and_Names:
          type: array
          items:
            type: string
          description: One or more reliable Label names or IDs directly related to the search, refining results to specific functions or categories.
        domain:
          type: string
          description: The entity's domain, such as 'light', 'cover', 'climate', 'fan', 'media_player', 'switch', 'button', 'person', 'sensor', 'device_tracker'. Specifying the domain helps refine searches to specific types of entities and is preferred to narrow down the results effectively, though it is optional. If the search is specifically for a single entity named after a person, default to using 'person' as the domain, but when searching for multiple entities associated with a person, do not set the domain parameter to 'person' by default.
        limit:
          type: integer
          description: Maximum number of search results to return, adjusted based on search scope to ensure optimal coverage without overwhelming results. Defaults to 15.
      required:
        - Label_IDs_and_Names
        - limit
  function:
    type: composite
    sequence:
      - type: template
        value_template: >
          {# Macro to convert area names to area IDs #}
          {%- macro convert_to_area_ids(area_IDs_and_Names) -%}
            {%- set ns = namespace(area_IDs=[]) -%}
            {# Ensure input is a list even if a single string is provided #}
            {%- set area_IDs_and_Names = [area_IDs_and_Names] if area_IDs_and_Names is string else area_IDs_and_Names -%}
            {# Iterate over each item and process only if it's a valid area name or ID #}
            {%- for item in area_IDs_and_Names if area_id(item) or item in areas() -%}
              {# Append the area ID if item is a valid area name or confirm it's an area ID #}
              {%- if area_id(item) -%}
                {%- set ns.area_IDs = ns.area_IDs + [area_id(item)] -%}
              {%- else -%}
                {%- set ns.area_IDs = ns.area_IDs + [item] -%}
              {%- endif -%}
            {%- endfor -%}
            {# Convert the list of area IDs to a comma-separated string to ensure it's returned as a string #}
            {{- ns.area_IDs | join(',') -}}
          {%- endmacro -%}
          
          {# Initialize namespace variables #}
          {%- set ns = namespace(
            floor_ID=None,
            area_IDs=[],
            valid_Label_IDs=[],
            Label_IDs=[],
            filtered_entities=[],
            include=true
          ) -%}
          
          {# Set 'domain' to empty string if it is undefined #}
          {%- set domain = domain | default('', true) -%}
          
          {# Validate and convert floor_ID_or_Name to floor_ID #}
          {%- if floor_ID_or_Name is defined and floor_ID_or_Name -%}
            {%- if floor_id(floor_ID_or_Name) -%}
              {%- set ns.floor_ID = floor_id(floor_ID_or_Name) -%}
            {%- elif floor_ID_or_Name in floors() -%}
              {%- set ns.floor_ID = floor_ID_or_Name -%}
            {%- endif -%}
          {%- endif -%}
          
          {%- if area_IDs_and_Names is not defined -%}
            {%- set area_IDs_and_Names = [] -%}
          {%- endif -%}
          {# Convert area names to IDs #}
          {%- set area_IDs_and_Names = [area_IDs_and_Names] if area_IDs_and_Names is string else area_IDs_and_Names -%}
          {%- if area_IDs_and_Names -%}
            {%- set area_IDs_String = convert_to_area_ids(area_IDs_and_Names) -%}
            {%- set ns.area_IDs = area_IDs_String.split(',') -%}
          {%- else -%}
            {%- set ns.area_IDs = [] -%}
          {%- endif -%}
          
          {# Convert Label names to IDs #}
          {%- set Label_IDs_and_Names = [Label_IDs_and_Names] if Label_IDs_and_Names is string else Label_IDs_and_Names -%}
          {%- for label_identifier in Label_IDs_and_Names -%}
            {%- set current_label_id = label_id(label_identifier) -%}
            {%- if current_label_id -%}
              {%- set ns.valid_Label_IDs = ns.valid_Label_IDs + [current_label_id] -%}
            {%- elif label_identifier in labels() -%}
              {%- set ns.valid_Label_IDs = ns.valid_Label_IDs + [label_identifier] -%}
            {%- endif -%}
          {%- endfor -%}
          {%- if ns.valid_Label_IDs | length > 0 -%}
            {%- set ns.Label_IDs = ns.valid_Label_IDs -%}
          {%- endif -%}
          
          {# Main loop to filter entities based on the pre-set conditions #}
          {%- for state in states if not is_hidden_entity(state.entity_id) -%}
            {%- set ns.include = true -%}
            {%- set entity_area_id = area_id(state.entity_id) -%}
          
            {%- if ns.floor_ID and (not entity_area_id or entity_area_id not in floor_areas(ns.floor_ID)) -%}
              {%- set ns.include = false -%}
            {%- elif ns.area_IDs and ns.area_IDs != [''] and (not entity_area_id or entity_area_id not in ns.area_IDs) -%}
              {%- set ns.include = false -%}
            {%- elif domain and domain | trim != "" and state.domain != domain -%}
              {%- set ns.include = false -%}
            {%- elif ns.Label_IDs|length > 0 -%}
              {# Check if all specified labels are present #}
              {%- for label_id in ns.Label_IDs -%}
                {%- if label_id not in labels(state.entity_id) -%}
                  {%- set ns.include = false -%}
                {%- endif -%}
              {%- endfor -%}
            {%- endif -%}
          
            {# Add to filtered entities if all conditions are met #}
            {%- if ns.include -%}
              {%- set ns.filtered_entities = ns.filtered_entities + [state.entity_id] -%}
            {%- endif -%}
          {%- endfor -%}
          
          {# Output the filtered entities in JSON format #}
          {{- ns.filtered_entities | to_json -}}
        response_variable: filtered_entity_ids
      - type: sqlite
        query: >-
          {% set ns = namespace(exposed_entities = '') %}
          {% set entity_ids = (filtered_entity_ids | from_json) %}
          
          {% for entity_id in entity_ids %}
            {% if is_exposed(entity_id) %}
              {% set ns.exposed_entities = ns.exposed_entities + entity_id + ',' %}
            {% endif %}
          {% endfor %}
          {% if ns.exposed_entities %}
            SELECT '{{ ns.exposed_entities[:-1] }}' AS result;
          {% else %}
            SELECT '' AS result;
          {% endif %}
        response_variable: result
      - type: template
        value_template: >-
          {% if result is defined %}
            {% set json_result = result | tojson %}
            {% set result_value = (json_result | from_json)[0].result | default('', true) | safe %}
          {% else %}
            {% set result_value = '' %}
          {% endif %}
          
          {% if result_value == '' %}
            {{ [] | tojson }}
          {% elif result_value is string and (result_value.startswith('[') and result_value.endswith(']')) %}
            {{ result_value }}
          {% elif result_value is string %}
            {% set result_list = result_value.split(',') %}
            {{ result_list | tojson }}
          {% else %}
            {{ [] | tojson }}
          {% endif %}
        response_variable: filtered_entity_ids
      - type: script
        sequence:
          - service: python_script.exec
            data:
              # cache: false  # disable cache if you want change python file without reload HA
              source: |
                from fuzzywuzzy import process
                import json
                from homeassistant.components.homeassistant.exposed_entities import async_should_expose

                try:
                    entity_ids = json.loads('{{ filtered_entity_ids }}')
                except json.JSONDecodeError:
                    entity_ids = []  # Safe fallback

                try:
                    limit = int('{{ limit | default(15, true) }}')
                    if not 15 <= limit <= 80:
                        limit = 15 # Ensure limit is within a reasonable range
                except ValueError:
                    limit = 15  # Default

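                # Skip IDs that no longer resolve to a state, so the lookups below cannot fail on None
                entities = [state for state in (hass.states.get(entity_id) for entity_id in entity_ids) if state is not None]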
                entity_names = [entity.attributes.get('friendly_name', entity.entity_id.split('.')[1]) for entity in entities]
                name_to_id = {entity.attributes.get('friendly_name', entity.entity_id.split('.')[1]): entity.entity_id for entity in entities}

                search_name = "{{ name.replace('\"', '\\\"') if name is defined else '' }}"
                if search_name.strip() == "":
                    # If name is empty, return all entities without performing a fuzzy search
                    results_dict = {name: entity_id for name, entity_id in name_to_id.items()}
                else:
                    results = process.extract(search_name, entity_names, limit=limit)
                    # Creating a dictionary of results
                    results_dict = {name: name_to_id[name] for name, score in results if score > 50}

                # Outputting the dictionary directly to Home Assistant's standardized output
                # for script results which will be captured in _function_result automatically
            response_variable: _function_result
        response_variable: variables
      - type: template
        value_template: >
          {%- macro get_floor_info(entity_id) -%}
            {%- set floor_id = floor_id(entity_id) -%}
            {%- if floor_id is not none -%}
              Floor ID: {{ floor_id }}; Floor Name: {{ floor_name(floor_id) }}
            {%- else -%}
              No floor associated
            {%- endif -%}
          {%- endmacro -%}

          {%- macro get_device_info(entity_id) -%}
            {%- set device_id = device_id(entity_id) -%}
            {%- if device_id is not none -%}
              Device ID: {{ device_id }};
              Device Name: {{ device_attr(device_id, 'name') }};
              Device Name by User: {{ device_attr(device_id, 'name_by_user') | default('None specified') }}
            {%- else -%}
              No device associated
            {%- endif -%}
          {%- endmacro -%}

          {%- if variables.results_dict and variables.results_dict|length > 0 -%}
            Give a simple direct answer with no details.
            Ensure responses strictly align with the user’s explicit query, focusing solely on entities that directly match the search criteria.
            1. Entity Verification: Before presenting results, verify that the entities indeed belong to the correct category, such as 'light', and are not incorrectly categorized.
            2. Contextual Presentation: Present entities in the context of the search request, clearly marking them to ensure correct identification and understanding.
            3. Error Correction and Re-search: If the initial results are incorrect or not satisfying, recommend conducting a new search with adjusted parameters to better meet the user’s needs.
            4. Ensure that responses incorporate the user’s language preferences, especially when referencing technical terms such as 'floor' or 'area'.

            By following these steps, you ensure that the information presented is both relevant and reliable, aiding in informed decision-making and enhancing user satisfaction.
            This is the search result:

            {% for name, entity_id in variables.results_dict.items() -%}
              {# Ensure that entity_id is valid and exists #}
              {%- if entity_id and states[entity_id] -%}
                Entity ID: {{ entity_id }}; Entity Name: {{ states[entity_id].name | default('Unknown name') }};
                {%- set device_class=state_attr(entity_id, 'device_class') -%}
                {%- if device_class -%}
                  Device Class: {{ state_attr(entity_id, 'device_class') }};
                {% endif -%}
                Area Name: {{ area_name(entity_id) if area_id(entity_id) else 'No area associated' }};
                {{ get_floor_info(entity_id) }};
                Label IDs: {{ labels(entity_id) | join(', ') | default('No labels') }};
                {{ get_device_info(entity_id) }}.
              {%- else -%}
                Entity with ID {{ entity_id }} does not exist or is invalid.
              {%- endif -%}
            {% endfor -%}
          {%- else -%}
            "No close matches found{{ ' for ' ~ name if name is defined else '' }}. Try adjusting your search term for better results."
          {%- endif -%}
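
The name-matching step in the function above (score every friendly name, keep results scoring over 50, cap the list at `limit`) can be tried outside Home Assistant. The sketch below swaps in the standard library's `difflib.SequenceMatcher` for `fuzzywuzzy.process.extract`, so the actual scores will differ from what the real library produces. The helper names `normalize_limit` and `fuzzy_lookup` and the sample entities are made up for illustration.

```python
from difflib import SequenceMatcher


def normalize_limit(raw, default=15, low=15, high=80):
    """Mirror the limit handling above: bad or out-of-range values fall back to the default."""
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        return default
    return limit if low <= limit <= high else default


def fuzzy_lookup(search_name, name_to_id, limit=15, min_score=50):
    """Return {friendly_name: entity_id} for the best matches above min_score."""
    scored = []
    for name in name_to_id:
        # SequenceMatcher.ratio() is 0..1; scale it to 0..100 like fuzzywuzzy scores.
        score = 100 * SequenceMatcher(None, search_name.lower(), name.lower()).ratio()
        if score > min_score:
            scored.append((score, name))
    scored.sort(reverse=True)  # best score first
    return {name: name_to_id[name] for _, name in scored[:limit]}


# Assumed sample data, not taken from a real installation.
entities = {
    "Kitchen Ceiling Light": "light.kitchen_ceiling",
    "Kitchen Fan": "fan.kitchen",
    "Garage Door": "cover.garage_door",
}
print(normalize_limit("200"))
print(fuzzy_lookup("kitchen light", entities))
```

Note that, like the original, `normalize_limit` resets an out-of-range value to the default rather than clamping it to the nearest bound.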

@rfam13

rfam13 commented Apr 12, 2024

Awesome! I ended up with the same libraries and a character count with a reward and a penalty based on comparing the length of the friendly name with the length of the requested device name. This helped eliminate entities that had 1 or 2 extra characters but otherwise matched all the other rules.
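
A rough sketch of what such a reward-and-penalty score could look like. This is a guess at the general shape, not the actual code described above; `score_name`, `best_match`, and the `length_penalty` weight are hypothetical.

```python
def score_name(query, friendly_name, length_penalty=1.0):
    """Character-overlap score with a penalty for differing lengths."""
    query = query.lower()
    friendly_name = friendly_name.lower()
    # Reward: one point per query character that appears in the candidate.
    reward = sum(1 for char in query if char in friendly_name)
    # Penalty: grows with the difference in length between the two names.
    penalty = length_penalty * abs(len(friendly_name) - len(query))
    return reward - penalty


def best_match(query, friendly_names):
    """Return the highest-scoring friendly name, or None if the list is empty."""
    return max(friendly_names, key=lambda name: score_name(query, name), default=None)


candidates = ["Desk Lamps 2", "Desk Lamp", "Kitchen Light"]
print(best_match("desk lamp", candidates))
```

Without the penalty, "Desk Lamps 2" and "Desk Lamp" earn the same reward for the query "desk lamp", so the winner depends only on list order; the length penalty is what separates them.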

I will test yours out when I get a chance, yours may be a winner!

@jleinenbach

jleinenbach commented Apr 23, 2024

I updated to a new version above. It now requires a Label, but it should work even if you remove this requirement.

And you may want to add this to your prompt:

Search and verify correct entity IDs before responding to queries. Follow these guidelines:

1. **Search Automatically:** Use system functions to identify and verify entity IDs.
2. **Verify:** Ensure all IDs are from reliable sources and confirmed.
3. **No Assumptions:** Do not guess or assume IDs without confirmation.

Example:
User: "When did it ring at the front door yesterday?"
You: "Searching for the verified entity ID of the doorbell and checking historical data."

These instructions ensure all responses are based on validated and accurate information.
