Add/LatinFinance Migrator pt2 #298

Open
wants to merge 30 commits into
base: trunk

iuravic
Collaborator

@iuravic iuravic commented May 16, 2023

This updated wp cli command for setting primary categories will now
accept arguments for posts-per-batch and batches.

Posts per batch is essentially WP_Query::posts_per_page. Setting a
low number here reduces the memory each SQL query uses.

The batches argument sets the number of WP_Query runs.
The default of -1 keeps running queries until all posts
have been updated with a Yoast primary category.

This command can be run multiple times as it will only process new
rows that have yet to be processed.
@ronchambers
Collaborator

Commit 6eee35a now allows arguments to be passed to wp newspack-content-migrator latinfinance-set-primary-categories

Help:

[Screenshot: lf-wp-cli-set-primary-categories-help]

To run on staging where memory may be limited, try this command:

wp newspack-content-migrator latinfinance-set-primary-categories --posts-per-batch=100 --batches=1

This limits each WP_Query to 100 posts (the default), and the query runs only once (--batches=1). Once you have found a posts-per-batch value that does not exhaust memory, set --batches=-1 to keep running queries until all posts have been processed.
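
A minimal sketch of how the two arguments interact, written in Python rather than the command's PHP so it can run standalone. All function and parameter names here are hypothetical; `fetch_unprocessed` stands in for a WP_Query that returns at most posts-per-batch posts that have not been processed yet.

```python
from typing import Callable, List


def run_batches(
    fetch_unprocessed: Callable[[int], List[int]],
    process: Callable[[int], None],
    posts_per_batch: int = 100,
    batches: int = -1,
) -> int:
    """Process posts in batches and return how many posts were handled.

    batches=-1 keeps fetching until no unprocessed posts remain.
    """
    handled = 0
    runs = 0
    while batches == -1 or runs < batches:
        # Each fetch is one query, capped at posts_per_batch results.
        post_ids = fetch_unprocessed(posts_per_batch)
        if not post_ids:
            break  # nothing left to do
        for post_id in post_ids:
            process(post_id)
            handled += 1
        runs += 1
    return handled
```

In this sketch, with 250 pending posts, a call with `posts_per_batch=100` and `batches=1` handles 100 posts and stops; a later call with `batches=-1` handles the remaining 150, because already processed posts are never fetched again.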

CoAuthors will be added to posts that contain the postmeta value
'newspack_lf_authors' and do not already have CoAuthors set.

This command can be run multiple times as it will only process new
rows that have yet to be processed.

Command:
wp newspack-content-migrator latinfinance-set-coauthors-plus

Arguments:

--posts-per-batch is essentially WP_Query::posts_per_page. Setting a
low number here reduces the memory each SQL query uses.

--batches sets the number of WP_Query runs.
The default of -1 keeps running queries until all posts
have been updated.
@ronchambers
Collaborator

Commit 3e107b6 will set CoAuthors for each post based on a postmeta value.

[Screenshot: lf-wp-cli-set-coauthors-help]

To run on staging where memory may be limited, try this command:

wp newspack-content-migrator latinfinance-set-coauthors-plus --posts-per-batch=100 --batches=1

This limits each WP_Query to 100 posts (the default), and the query runs only once (--batches=1). Once you have found a posts-per-batch value that does not exhaust memory, set --batches=-1 to keep running queries until all posts have been processed.

The code checks every postmeta 'newspack_lf_url' URL value with the
WordPress function url_to_postid to see whether the correct post is
found. If not, a redirect must be added for the Redirection plugin.
Other checks are done too.

WP CLI command:
wp newspack-content-migrator latinfinance-check-redirects

Output:
A report is printed to the screen as the checks run, and
two CSVs are exported to the WP_CONTENT_DIR path; both need to be
uploaded to the Redirection plugin.
File 1) Existing redirects from the old site.
File 2) New redirects that are needed for WordPress.
@ronchambers
Collaborator

Commit 3c550fd will export the needed Redirection plugin redirects.

** THIS WAS PERFORMED ON RON'S LOCAL MACHINE USING THE LATEST STAGING BACKUP. DO NOT RE-RUN. RON WILL UPLOAD THE OUTPUT CSVS INTO THE STAGING REDIRECTION PLUGIN. NO FURTHER ACTION NEEDED **

The code checks every postmeta newspack_lf_url URL value with the WordPress function url_to_postid to see whether the correct post is found. If not, a redirect must be added for the Redirection plugin. Other checks are done too.

WP CLI command:

wp newspack-content-migrator latinfinance-check-redirects

Output:

A report is printed to the screen as the checks run, and two CSVs are exported to the WP_CONTENT_DIR path; both need to be uploaded to the Redirection plugin.
File 1) Existing redirects from the old site.
File 2) New redirects that are needed for WordPress.
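
The core check described above can be sketched as follows, in Python so it runs without WordPress. This is an illustration under stated assumptions, not the command's code: `resolve` is a stand-in for url_to_postid (which returns 0 when no post matches), and the `/?p=` target and the CSV column names are hypothetical.

```python
import csv
import io
from typing import Callable, Dict, List, Tuple


def find_needed_redirects(
    old_urls: Dict[int, str],
    resolve: Callable[[str], int],
) -> List[Tuple[str, str]]:
    """Return (old URL, target) pairs for posts whose old URL does not resolve to them.

    old_urls maps post ID -> the value stored in postmeta 'newspack_lf_url'.
    resolve stands in for url_to_postid and returns 0 when nothing matches.
    """
    needed = []
    for post_id, old_url in sorted(old_urls.items()):
        if resolve(old_url) != post_id:
            # Hypothetical target format: the plain ?p= link of the post.
            needed.append((old_url, f"/?p={post_id}"))
    return needed


def to_csv(rows: List[Tuple[str, str]]) -> str:
    """Serialize the pairs as CSV text. The column names are illustrative only."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target"])
    writer.writerows(rows)
    return buffer.getvalue()
```

A URL that already resolves to its own post produces no row, so only the posts that WordPress cannot find at their old address end up in the export.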

Command will export a CSV file that needs to be reviewed by hand
and adjusted prior to being imported into Redirection plugin. Adjustment
details are still being finalized. No CSV upload at this point.

Command:
wp newspack-content-migrator latinfinance-check-redirects-magazine-issues

Output:
CSV in WP_CONTENT_DIR
@ronchambers
Collaborator

ronchambers commented May 23, 2023

Commit 8bd9a29 will export a CSV with Magazine Issue redirects for review. No data will be uploaded to the Redirection plugin at this time.

** Discussion on Asana **

** Do not run command at this commit. Use the next commit instead **

Command will export a CSV file that needs to be reviewed by hand and adjusted prior to being imported into Redirection plugin. Adjustment details are still being finalized. No CSV upload at this point.

Command:
wp newspack-content-migrator latinfinance-check-redirects-magazine-issues

Output:
CSV in WP_CONTENT_DIR

Fixes previous commit related to wp cli command:
wp newspack-content-migrator latinfinance-check-redirects-magazine-issues

This new fix will properly export a CSV file for upload into the Redirection
plugin. The redirects will capture old website /magazine/[issues] URLs
and redirect them to new /magazine/year/monthnum/ URLs.
@ronchambers
Collaborator

New commit eb5e1ff fixes previous commit to properly run the Magazine Issues url redirects.

This new fix will properly export a CSV file for upload into the Redirection plugin. The redirects will capture old website /magazine/[issues] URLs and redirect them to new /magazine/year/monthnum/ URLs.

CSV file can be reviewed by hand and adjusted prior to being imported into Redirection plugin.

** CSV File has been uploaded to Staging by Ron **

Command:
wp newspack-content-migrator latinfinance-check-redirects-magazine-issues

Output:
CSV in WP_CONTENT_DIR

Old site magazine redirects have been added, but internal links were
not. This will redirect WP links like:

'/magazine/yyyy/mon/day/read-digital-edition' to offsite digital PDF viewer.

Command:
wp newspack-content-migrator latinfinance-check-redirects-digital-editions

Output:
Redirects CSV to upload.
@ronchambers
Collaborator

Commit 6d4c537 exports a CSV for internal Read Digital Edition links.

Old site magazine redirects have been added, but internal links were not. This will redirect WP links like: '/magazine/yyyy/mon/day/read-digital-edition' to offsite digital PDF viewer.

Command:
wp newspack-content-migrator latinfinance-check-redirects-digital-editions

Output:
Redirects CSV to upload.

** CSV uploaded by Ron **

Some posts were inserted twice via the wordpress importer. The importer
did not create 2 posts, but instead it added the 2nd post's postmeta
to the first post that already existed. This command will delete
all duplicated postmeta where the post_id, meta_key, and meta_value
all match.

Command:
wp newspack-content-migrator latinfinance-delete-duplicate-meta

Arguments:
--query-limit=integer
(use this to limit large sql queries).
@ronchambers
Collaborator

ronchambers commented May 24, 2023

Commit 6d5e2cb will delete duplicate meta.

Some posts were inserted twice via the wordpress importer. The importer did not create 2 posts, but instead it added the 2nd post's postmeta to the first post that already existed. This command will delete all duplicated postmeta where the post_id, meta_key, and meta_value all match. It is safe to run command multiple times.

Command:
wp newspack-content-migrator latinfinance-delete-duplicate-meta

Arguments:
--query-limit=integer (use this to limit large sql queries) (default 1000 rows)

Output:
A report CSV will be output to WP_CONTENT_DIR.
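
The matching rule can be sketched in Python as below. This is a minimal illustration, not the command's code: the `MetaRow` type and function name are hypothetical, and keeping the row with the lowest meta_id is an assumption (the commit only says that rows matching on post_id, meta_key, and meta_value are duplicates).

```python
from typing import Iterable, List, NamedTuple


class MetaRow(NamedTuple):
    meta_id: int
    post_id: int
    meta_key: str
    meta_value: str


def duplicate_meta_ids(rows: Iterable[MetaRow]) -> List[int]:
    """Return the meta_ids to delete, keeping the lowest meta_id of each group.

    Two rows are duplicates only when post_id, meta_key, and meta_value all match.
    """
    seen = set()
    to_delete = []
    for row in sorted(rows, key=lambda r: r.meta_id):
        key = (row.post_id, row.meta_key, row.meta_value)
        if key in seen:
            to_delete.append(row.meta_id)  # a later copy of an existing row
        else:
            seen.add(key)  # first occurrence is kept
    return to_delete
```

Running the function again on the rows that remain returns an empty list, which mirrors why the command is safe to run multiple times.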

The importer "merged" duplicate posts from the old system into single
posts. It will be easier to just delete the single posts, then re-
import the original posts later. Each post that is "deleted" accounts
for 2 articles from the old client website.

Command:
wp newspack-content-migrator latinfinance-delete-error-imports

Output:
CSV report is output. Nothing is actually deleted. There is a link
to each post in the CSV that can be used to delete each post by
hand.
@ronchambers
Collaborator

Commit d806d6c adds a cli command to "delete" error posts.

The importer "merged" duplicate posts from the old system into single posts. It will be easier to just delete the single posts, then re-import the original posts later. Each post that is "deleted" accounts for 2 articles from the old client website.

Command:
wp newspack-content-migrator latinfinance-delete-error-imports

Output:
CSV report is output. Nothing is actually deleted. There is a link to each post in the CSV that can be used to delete each post by hand.

** Ron performed the deletions by hand on staging **

Resets duplicate categories that the WP importer merged as one. The categories
are the children "Hydro", "Solar", and "Wind" under both "Energy" and "ESG".

This code uses the postmeta 'newspack_lf_categories' to reset
the categories to the proper posts.

Command:
wp newspack-content-migrator latinfinance-fix-duplicate-categories

Output:
Screen report of removals and additions. If errors, "by hand" fixes
needed will be shown.
@ronchambers
Collaborator

Commit c9857d1 fixes merged categories.

The command resets duplicate categories that the WP importer merged as one. The categories are the children "Hydro", "Solar", and "Wind" under both "Energy" and "ESG". This code uses the postmeta 'newspack_lf_categories' to reset the categories to the proper posts.

Command:
wp newspack-content-migrator latinfinance-fix-duplicate-categories

Output:
Screen report of removals and additions. If errors, "by hand" fixes needed will be shown.

The wordpress importer had trouble replacing post_content image urls
when the urls contained "&" characters. This was fixed with an update
to the newspack-cms-importers WXR creator, but there were some images
that still needed updating in the post_content.

Command:
wp newspack-content-migrator latinfinance-fix-images-in-content

Output:
Screen report of DB fixes, along with todo by-hand fixes needed.
@ronchambers
Collaborator

Commit 557d0e9 will fix Images in post_content due to URLs with "&" character.

The WordPress importer had trouble replacing post_content image urls when the urls contained "&" characters. This was fixed with an update to the newspack-cms-importers WXR creator, but there were some images that still needed updating in the post_content.

Command:
wp newspack-content-migrator latinfinance-fix-images-in-content

Output:
Screen report of DB fixes, along with todo by-hand fixes needed.
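
The comment above does not say why "&" broke the replacement. One common cause is that the URL appears in post_content with "&" written as "&amp;", so a plain string match on the raw URL misses it. Assuming that is the situation, a replacement that tries both forms could look like this Python sketch; the function name and example URLs are hypothetical.

```python
import html


def replace_image_url(post_content: str, old_url: str, new_url: str) -> str:
    """Replace old_url in post_content whether "&" is raw or entity-encoded."""
    escaped = html.escape(old_url, quote=False)  # "&" becomes "&amp;"
    # Try the longer, escaped form first so no partial match is left behind.
    for variant in (escaped, old_url):
        post_content = post_content.replace(variant, new_url)
    return post_content
```

Images whose URL matches neither form would still need the by-hand fixes that the screen report lists.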

Missing old_assets images will now be removed from the post_content.

Command:
wp newspack-content-migrator latinfinance-fix-images-in-content

Output:
Screen report of DB fixes, along with todo by-hand fixes needed.
@ronchambers
Collaborator

Commit 3c34186 fixes previous commit to remove old_assets images in content.

Missing old_assets images will now be removed from the post_content.

Command:
wp newspack-content-migrator latinfinance-fix-images-in-content

Output:
Screen report of DB fixes, along with todo by-hand fixes needed.

Added previous checksum checking and log to output files for by-hand
review.
Checks for new client categories, duplicate slugs, and post_exists
Added a secondary command for setting CoAuthors Plus based on the diff
between initial load and launch data.
Small regex fix and more logging
@ronchambers
Collaborator

Commit 0f3c4a1 is the final commit to export the Newsletters / Mailchimp WXRs.

@ronchambers
Collaborator

Commit eb1ca86 adds cli for LatinFinance Awards CAP-GA.

Command: wp newspack-content-migrator latinfinance-fix-awards-bylines --cats=
