magento/commerce-data-export

Overview

This project assembles and synchronizes data that represents Magento entities. This data is the source consumed by external or SaaS services and integrations.

To collect and store the denormalized data required by the SaaS services and integrations, the export component uses native Magento indexer functionality under the hood. As a result, after the component is installed, a merchant will see new indexers in the corresponding admin menu, where each indexer represents a data feed.

Requirements and Dependencies

The export component is a set of Magento modules and requires Magento 2.4.4 or higher.

Contributing

Contributions are welcome! Read the Contributing Guide for more information.

Licensing

This project is licensed under the OSL-3.0 License. See LICENSE for more information.

Export process

This extension collects entity data (called a "feed") and exports it to the consumer immediately after the feed items have been collected. The consumer must implement the interface method Magento\DataExporter\Model\ExportFeedInterface::export (see the default implementation in magento/saas-export).

An implementation of ExportFeedInterface::export must return the status of the operation as Magento\DataExporter\Model\FeedExportStatus, with a response status code of type Magento\DataExporter\Status\ExportStatusCode. The status code can be:

  • An HTTP status code:
      -- 200 - exported successfully
      -- 4xx - client can't process request
      -- 5xx - server-side error
  • Or a custom code:
      -- Magento\DataExporter\Status\ExportStatusCodeProvider::APPLICATION_ERROR - something went wrong in the Adobe Commerce configuration or processing
      -- Magento\DataExporter\Status\ExportStatusCodeProvider::FAILED_ITEM_ERROR - some of the items in the request were not processed successfully

These codes are saved in the "status" field of the feed table to keep track of each item's status and to resend items that have a "retryable" status (every status other than 200 or 400 is retryable):
    public const NON_RETRYABLE_HTTP_STATUS_CODE = [200, 400];

Immediate export flow:

  • collect entities during reindex or save action
  • get entities that have to be deleted from the feed (instead of updating the feed table with is_deleted=true)
  • filter entities with identical hash (only if export status NOT IN [Magento\DataExporter\Status\ExportStatusCodeProvider::NON_RETRYABLE_HTTP_STATUS_CODE])
  • submit entities to consumer via ExportFeedInterface::export and return status of submitted entities
  • persist to feed table state of exported entities
  • save record state status according to exporting result

Retry Logic for failed entities (only server error code):

  • a cron job checks whether the feed table contains any entities with a status other than [200, 400] (Magento\DataExporter\Status\ExportStatusCodeProvider::NON_RETRYABLE_HTTP_STATUS_CODE)
  • select entities filtered by modified_at and status NOT IN (200, 400)
  • run a partial reindex

Migration to immediate export approach:

  • Add new columns (required for immediate feed processing) to db_schema of the feed table:
         <column
             xsi:type="smallint"
             name="status"
             nullable="false"
             default="0"
             comment="Status"
         />
         <column
             xsi:type="varchar"
             name="feed_hash"
             nullable="false"
             length="64"
             comment="Feed Hash"
         />
         <column
             xsi:type="text"
             name="errors"
             nullable="true"
             comment="Errors"
         />
  • di.xml changes (apply them to the virtual type if one is created for the FeedIndexMetadata type; otherwise, add these arguments to the real class). Change the exportImmediately value to true in the metadata configuration:
         <argument name="exportImmediately" xsi:type="boolean">true</argument>
  • For debugging purposes, you can keep saving the whole data to the feed table by setting the persistExportedFeed argument to true.
  • Add the minimalPayload argument with the minimal set of fields required by the Feed Ingestion Service. It is used to handle cases when a feed item has been deleted. For example:
    <argument name="minimalPayload" xsi:type="array">
        <item name="sku" xsi:type="string">sku</item>
        <item name="customerGroupCode" xsi:type="string">customerGroupCode</item>
        <item name="websiteCode" xsi:type="string">websiteCode</item>
        <item name="updatedAt" xsi:type="string">updatedAt</item>
    </argument>
  • Add the feedIdentifierMapping argument, which describes the mapping between primary key columns in the feed table and the corresponding fields in the feed item. For example:
    <argument name="feedIdentifierMapping" xsi:type="array">
        <item name="product_id" xsi:type="string">productId</item>
        <item name="website_id" xsi:type="string">websiteId</item>
        <item name="customer_group_code" xsi:type="string">customerGroupCode</item>
    </argument>
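
To show where these arguments sit, here is a minimal di.xml sketch of a virtual type with immediate export and the debugging option enabled. The virtual type name ExampleFeedIndexMetadata is hypothetical, and the fully qualified class name and the boolean type of persistExportedFeed are assumptions that should be checked against your module.

```xml
<?xml version="1.0"?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <!-- Hypothetical virtual type name; the class path is an assumption -->
    <virtualType name="ExampleFeedIndexMetadata" type="Magento\DataExporter\Model\Indexer\FeedIndexMetadata">
        <arguments>
            <!-- Export feed items right after they have been collected -->
            <argument name="exportImmediately" xsi:type="boolean">true</argument>
            <!-- Debugging only: keep saving the whole data to the feed table (assumed boolean) -->
            <argument name="persistExportedFeed" xsi:type="boolean">true</argument>
        </arguments>
    </virtualType>
</config>
```

The minimalPayload and feedIdentifierMapping arguments shown above would go inside the same arguments node.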
    

Feed Index Metadata additional parameters:

  • entitiesRemovable - this parameter configures whether feed entities can be removed. Default value: false (feed entities cannot be removed). For example:
      -- sales order feed: Sales Order entities cannot be deleted, so the isRemovable metadata parameter is set to false.
      -- product feed: Products can be deleted, so the isRemovable metadata parameter MUST be set to true. Otherwise, feed records will not be marked as deleted when an entity is removed.
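
As a rough sketch, a feed whose entities can be deleted might set this in di.xml as follows. It assumes the di.xml argument is named entitiesRemovable, as in the parameter name above, and that it is a boolean. The virtual type name ExampleProductFeedIndexMetadata and the class path are hypothetical.

```xml
<?xml version="1.0"?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <!-- Hypothetical virtual type for a feed whose entities can be deleted -->
    <virtualType name="ExampleProductFeedIndexMetadata" type="Magento\DataExporter\Model\Indexer\FeedIndexMetadata">
        <arguments>
            <!-- Assumed argument name and type: lets feed records be marked as deleted -->
            <argument name="entitiesRemovable" xsi:type="boolean">true</argument>
        </arguments>
    </virtualType>
</config>
```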

Multi-thread data export mode:

The purpose of this mode is to speed up the export process by splitting the data into batches and processing them in parallel. Export throughput should stay within the limit defined for the client on the consumer side.

This mode is configured per feed indexer via system configuration (config.xml):

<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:module:Magento_Store:etc/config.xsd">
    <default>
        <commerce_data_export>
            <feeds>
                <products>
                    <thread_count>1</thread_count>
                    <batch_size>100</batch_size>
                </products>
            </feeds>
        </commerce_data_export>
    </default>
</config>
  • thread_count - number of threads that will be used for processing (1 by default)
  • batch_size - number of items that will be processed in one batch (100 by default)

The multi-thread data export mode applies to both full and partial reindex.

It may be useful to change thread_count and batch_size at runtime when performing data export via the CLI. This can be done by passing the --thread-count and --batch-size options to the saas:resync command.

For example:

bin/magento indexer:reindex catalog_data_exporter_products --thread-count=5 --batch-size=400
