The purpose of this project is to assemble and synchronize data that represents Magento entities. The resulting feeds are the data sources consumed by external or SaaS services and integrations.
To collect and store the denormalized data required by the SaaS services and integrations, the export component uses native Magento indexer functionality under the hood. As a result, once the component is installed, a merchant will see new indexers in the corresponding admin menu, where each indexer represents a data feed.
The export component is a set of Magento modules and requires Magento 2.4.4 or higher.
Contributions are welcome! Read the Contributing Guide for more information.
This project is licensed under the OSL-3.0 License. See LICENSE for more information.
This extension allows entity data (called a "feed") to be collected and exported to a consumer immediately after the feed items have been collected.
The consumer must implement the interface method Magento\DataExporter\Model\ExportFeedInterface::export (see the default implementation in magento/saas-export).
An implementation of ExportFeedInterface::export must return the status of the operation as Magento\DataExporter\Model\FeedExportStatus, with a response status code of type Magento\DataExporter\Status\ExportStatusCode. The status code:
- Can be an HTTP status code:
-- 200 - exported successfully
-- 4xx - client error, the request cannot be processed
-- 5xx - server side error
- Or one of the custom codes:
-- Magento\DataExporter\Status\ExportStatusCodeProvider::APPLICATION_ERROR - something went wrong in the Adobe Commerce configuration or processing
-- Magento\DataExporter\Status\ExportStatusCodeProvider::FAILED_ITEM_ERROR - some of the items in the request were not processed successfully
These codes are saved in the "status" field of the feed table, to keep track of item status and to resend items that have a "retryable" status (any status other than 200 or 400 is retryable). The immediate export flow is:
- collect entities during a reindex or save action
- get entities that have to be deleted from the feed (instead of updating the feed table with is_deleted=true)
- filter entities with an identical hash (only if the export status is NOT IN [Magento\DataExporter\Status\ExportStatusCodeProvider::NON_RETRYABLE_HTTP_STATUS_CODE])
- submit entities to the consumer via ExportFeedInterface::export and return the status of the submitted entities
- persist the state of the exported entities to the feed table
- save the record status according to the export result
Failed entities are retried as follows:
- a cron job checks whether the feed table contains any entities with a status other than [200, 400] (Magento\DataExporter\Status\ExportStatusCodeProvider::NON_RETRYABLE_HTTP_STATUS_CODE)
- select entities filtered by modified_at and status NOT IN (200, 400)
- run a partial reindex
- Add new columns (required for immediate feed processing) to db_schema of the feed table:
<column xsi:type="smallint" name="status" nullable="false" default="0" comment="Status" />
<column xsi:type="varchar" name="feed_hash" nullable="false" length="64" comment="Feed Hash" />
<column xsi:type="text" name="errors" nullable="true" comment="Errors" />
- di.xml changes (if a virtual type is created for the FeedIndexMetadata type; otherwise, add these arguments to the real class):
-- Change the exportImmediately value to true for metadata configuration:
<argument name="exportImmediately" xsi:type="boolean">true</argument>
- There is also an option, for debugging purposes, to keep saving the whole data to the feed table by setting the persistExportedFeed argument to true
- Add the minimalPayload argument with a minimal set of fields required by the Feed Ingestion Service. It is used to handle cases when a feed item has been deleted. For example:
<argument name="minimalPayload" xsi:type="array">
    <item name="sku" xsi:type="string">sku</item>
    <item name="customerGroupCode" xsi:type="string">customerGroupCode</item>
    <item name="websiteCode" xsi:type="string">websiteCode</item>
    <item name="updatedAt" xsi:type="string">updatedAt</item>
</argument>
- Add the feedIdentifierMapping argument, which describes the mapping between primary key columns in the feed table and the corresponding fields in the feed item. For example:
<argument name="feedIdentifierMapping" xsi:type="array">
    <item name="product_id" xsi:type="string">productId</item>
    <item name="website_id" xsi:type="string">websiteId</item>
    <item name="customer_group_code" xsi:type="string">customerGroupCode</item>
</argument>
- entitiesRemovable - this parameter covers cases when feed entities are not removable. Default value: false (feed entities cannot be removed). For example:
-- sales order feed: Sales Order entities cannot be deleted, so the isRemovable metadata parameter is set to false.
-- product feed: Products can be deleted, so the isRemovable metadata parameter MUST be set to true; otherwise, feed records would not be marked as deleted when the entity is removed.
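Put together, the di.xml part of the steps above might look roughly like the following. This is a sketch under stated assumptions, not the configuration of any shipped feed: the virtual type name Vendor\Module\Model\Indexer\ExampleFeedIndexMetadata is hypothetical, the fully qualified class path used for FeedIndexMetadata is an assumption, and it is assumed that persistExportedFeed and entitiesRemovable are passed as boolean arguments in the same way as exportImmediately. Replace these with the values used by the actual feed module.

```xml
<?xml version="1.0"?>
<!-- etc/di.xml of a feed module (illustrative sketch only) -->
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <!-- Hypothetical virtual type name; the "type" class path is an assumption -->
    <virtualType name="Vendor\Module\Model\Indexer\ExampleFeedIndexMetadata"
                 type="Magento\DataExporter\Model\Indexer\FeedIndexMetadata">
        <arguments>
            <!-- Send feed items to the consumer right after they are collected -->
            <argument name="exportImmediately" xsi:type="boolean">true</argument>
            <!-- Set to true only for debugging, to keep saving whole data to the feed table (argument form assumed) -->
            <argument name="persistExportedFeed" xsi:type="boolean">false</argument>
            <!-- Entities of this feed can be deleted, so removals must be tracked (argument form assumed) -->
            <argument name="entitiesRemovable" xsi:type="boolean">true</argument>
            <!-- Minimal set of fields sent when a feed item has been deleted -->
            <argument name="minimalPayload" xsi:type="array">
                <item name="sku" xsi:type="string">sku</item>
                <item name="customerGroupCode" xsi:type="string">customerGroupCode</item>
                <item name="websiteCode" xsi:type="string">websiteCode</item>
                <item name="updatedAt" xsi:type="string">updatedAt</item>
            </argument>
            <!-- Feed table primary key column => field name in the feed item -->
            <argument name="feedIdentifierMapping" xsi:type="array">
                <item name="product_id" xsi:type="string">productId</item>
                <item name="website_id" xsi:type="string">websiteId</item>
                <item name="customer_group_code" xsi:type="string">customerGroupCode</item>
            </argument>
        </arguments>
    </virtualType>
</config>
```

Keeping all arguments in a single virtual type makes it easier to check that the keys in feedIdentifierMapping match the primary key columns of the feed table and that the fields in minimalPayload are present in the feed item.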
The purpose of the multi-thread data export mode is to speed up the export process by splitting the data into batches and processing them in parallel. Export performance should be aligned with the limit defined for the client on the consumer side.
Configuration of this mode is done via System configuration (config.xml) per feed indexer:
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:module:Magento_Store:etc/config.xsd">
    <default>
        <commerce_data_export>
            <feeds>
                <products>
                    <thread_count>1</thread_count>
                    <batch_size>100</batch_size>
                </products>
            </feeds>
        </commerce_data_export>
    </default>
</config>
- thread_count - number of threads that will be used for processing (1 by default)
- batch_size - number of items that will be processed in one batch (100 by default)
The multi-thread data export mode is applied for full and partial reindex.
It may be useful to change thread_count and batch_size at runtime when performing data export via a CLI command. This can be done by passing the --thread-count and --batch-size options to the saas:resync command.
For example:
bin/magento saas:resync --feed=products --thread-count=5 --batch-size=400