docx 'remove' helper #1064

Open
moofoo opened this issue May 31, 2023 · 1 comment

moofoo commented May 31, 2023

Hello!

I'm interested in implementing a 'remove' built-in helper for the docx recipe, similar to Carbone's drop formatter.
A typical use of this tag in Carbone's templating language looks like:

{d.text:ifEM:drop(p)}

which means "if the data property 'text' is empty, drop the paragraph".

Besides paragraphs, the tag can also remove rows, tables, images, charts, and shapes.

Personally, I find removing paragraphs, rows and tables to have the most utility. Rows and tables, in particular, since you can group template content into border-less tables and conditionally drop those to remove chunks of content. This results in much cleaner templates, versus handling content variations with {{#if ...}}...{{/if}} conditional helpers.

The Carbone implementation doesn't have the ability to drop sections or whole pages, which I think would be very useful, but that's getting ahead of myself. To start, I'd just like to implement dropping paragraph, row and table elements.

I'm assuming this functionality isn't possible through user-space Handlebars helpers (if that's not the case, let me know!).

Assuming Handlebars is the engine, a jsreport implementation of this could look like:

{{docxRm value=truthyValue tag="p"}}

It would also be useful if value could be determined by a helper rather than a data property, but I'm not sure how the syntax for that should look.

I'm assuming the basic algorithm is:

  1. get parent node
  2. if parent node type === tag, remove node (and all children)
  3. else, go to 1

There's probably more to it than that, of course. I'm guessing I'd use the utility functions getClosestEl and clearEl for this.
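
The three steps above can be sketched on plain objects. This is only an illustration of the walk-up-and-remove idea: `makeNode`, `findClosestAncestor` and `removeClosest` are hypothetical names, not the recipe's `getClosestEl` or `clearEl` utilities, and the nodes are simple stand-ins for the parsed XML elements.

```javascript
// Minimal stand-in for a DOM node (assumption: real code would use the parsed docx XML document).
function makeNode (nodeName, children = []) {
  const node = { nodeName, parentNode: null, childNodes: children }
  for (const child of children) child.parentNode = node
  return node
}

// Hypothetical helper: walk up from `node` until an ancestor named `elementName` is found.
function findClosestAncestor (node, elementName) {
  let current = node.parentNode
  while (current != null && current.nodeName !== elementName) {
    current = current.parentNode
  }
  return current
}

// Hypothetical helper: detach the closest matching ancestor, which also drops all of its children.
function removeClosest (node, elementName) {
  const target = findClosestAncestor(node, elementName)
  if (target == null || target.parentNode == null) return false
  const siblings = target.parentNode.childNodes
  siblings.splice(siblings.indexOf(target), 1)
  target.parentNode = null
  return true
}

// Usage: a body with two paragraphs, the second one holds the text node carrying the tag.
const t = makeNode('w:t')
const p = makeNode('w:p', [makeNode('w:r', [t])])
const body = makeNode('w:body', [makeNode('w:p'), p])
const removed = removeClosest(t, 'w:p')
console.log(removed, body.childNodes.length)
```

If no ancestor with the requested name exists, the sketch returns false and leaves the tree untouched.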

I've only just started looking into the jsreport code to figure out how to do this, but right now I'm assuming I would do the following:

  1. add a preprocess module 'remove.js':

packages/jsreport-docx/lib/preprocess/remove.js

const { nodeListToArray } = require('../utils')

// matches a non-block call such as {{docxRm value=foo tag="p"}}
const regexp = /{{docxRm [^{}]{0,500}}}/

module.exports = (files) => {
  for (const f of files.filter(f => f.path.endsWith('.xml'))) {
    const doc = f.doc

    // text nodes that contain a docxRm call
    const docxRmElements = nodeListToArray(doc.getElementsByTagName('w:t')).filter((tEl) => {
      return regexp.test(tEl.textContent)
    })

    /* do stuff. NEED ADVICE ON THIS
    - need to handle nested docxRm tags. That is, the items in docxRmElements can change as nodes are
      removed, which needs to be accounted for
    - not sure how to go about getting 'tag' and 'value' from the node
    */
  }
}
  2. import and call it in preprocess.js. Assuming it would need to be called first (before bookmark(...)):

// other imports...
const remove = require('./remove')

module.exports = (files) => {
  concatTags(files)
  context(files)

  const sectionsDetails = sections(files)

  const headerFooterRefs = sectionsDetails.reduce((acu, section) => {
    if (section.headerFooterReferences) {
      acu.push(...section.headerFooterReferences)
    }

    return acu
  }, [])

  remove(files)
  bookmark(files, headerFooterRefs)
  // rest of code...

  3. add a docxRm function to packages/jsreport-docx/static/helpers.js, something like:

function docxRm (options) {
  const Handlebars = require('handlebars')

  if (options.hash.tag == null) {
    throw new Error('docxRm helper requires tag parameter')
  }

  // normalize value to a boolean before serializing it into the marker
  options.hash.value = options.hash.value === 'true' || options.hash.value === true

  return new Handlebars.SafeString('$docxRm' + Buffer.from(JSON.stringify(options.hash)).toString('base64') + '$')
}
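
For reference, the marker that the helper above emits can be read back later with a small decoder. This is a sketch only: `encodeDocxRmMarker` and `decodeDocxRmMarkers` are hypothetical names, and the `$docxRm<base64>$` format is taken from the helper draft, not from any existing jsreport code.

```javascript
// Same encoding as the docxRm helper draft: '$docxRm' + base64(JSON(hash)) + '$'
function encodeDocxRmMarker (hash) {
  return '$docxRm' + Buffer.from(JSON.stringify(hash)).toString('base64') + '$'
}

// Hypothetical decoder: find every marker in a string and recover its parameters.
// The base64 alphabet never contains '$', so the closing '$' is unambiguous.
function decodeDocxRmMarkers (text) {
  const markers = []
  const re = /\$docxRm([A-Za-z0-9+/=]+)\$/g
  let match
  while ((match = re.exec(text)) !== null) {
    const hash = JSON.parse(Buffer.from(match[1], 'base64').toString('utf8'))
    markers.push({ index: match.index, raw: match[0], hash })
  }
  return markers
}

// Usage: round trip through a text node's content
const marker = encodeDocxRmMarker({ tag: 'p', value: true })
const found = decodeDocxRmMarkers('<w:t>' + marker + '</w:t>')
console.log(found[0].hash)
```

Keeping `index` and `raw` next to the decoded parameters makes it possible to locate and strip the marker text afterwards.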

Any guidance or insight would be appreciated. I'm working on this in my free time so there's no rush.

moofoo changed the title from "docx 'remove' handler" to "docx 'remove' helper" on Jun 2, 2023
Collaborator

bjrmatos commented Jun 8, 2023

hi, sorry for the delay, we were busy releasing jsreport 3.12.0

> Assuming handlebars is the engine, a JsReport implementation of this could look like:

it looks good to me, though i would call the parameter element instead of tag; the exact element that you pass would be w:p for paragraphs.

> it would also be useful if value could be determined by a helper rather than a data property, but I'm not sure how the syntax for that should look

exactly, that is what is going to make it more useful. handlebars calls these subexpressions:
{{docxRm value=(helperCall param) tag="p"}}

> I'm assuming the basic algorithm is:
>
>   1. get parent node
>   2. if parent node type === tag, remove node (and all children)
>   3. else, go to 1
>
> There's probably more to it than that, of course. I'm guessing I'd use the utility functions getClosestEl and clearEl for this.

so far in general, it looks like you got it right, so go ahead with that implementation. the helpers you mention should allow you to do it more easily

> 1. add a preprocess module 'remove.js':

yes, you need to add a new preprocess module there. for the logic, try to come up with something simple at first. the nested docxRm calls sound a bit complex, honestly; perhaps it would help if you explained how nested docxRm calls are useful.
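
One simple way to sidestep the nesting problem, assuming all removal targets are collected before anything is removed, is to keep only the outermost targets: if a table is going away, a paragraph inside it needs no separate handling. The sketch below uses plain stand-in nodes, and `makeNode`, `isDescendantOf` and `outermostTargets` are hypothetical names.

```javascript
// Minimal stand-in for a DOM node (assumption: real code would use the parsed XML elements).
function makeNode (nodeName, children = []) {
  const node = { nodeName, parentNode: null, childNodes: children }
  for (const child of children) child.parentNode = node
  return node
}

// True when `node` sits somewhere below `ancestor`.
function isDescendantOf (node, ancestor) {
  for (let cur = node.parentNode; cur != null; cur = cur.parentNode) {
    if (cur === ancestor) return true
  }
  return false
}

// Hypothetical helper: drop duplicates and any target already covered by another target.
function outermostTargets (targets) {
  const unique = [...new Set(targets)]
  return unique.filter(t => !unique.some(other => other !== t && isDescendantOf(t, other)))
}

// Usage: a paragraph inside a table cell, plus an unrelated paragraph
const innerP = makeNode('w:p')
const tbl = makeNode('w:tbl', [makeNode('w:tr', [makeNode('w:tc', [innerP])])])
const otherP = makeNode('w:p')
const toRemove = outermostTargets([innerP, tbl, otherP, innerP])
console.log(toRemove.map(n => n.nodeName))
```

The check is quadratic in the number of targets, which should be acceptable for the handful of docxRm calls a template is likely to contain.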

generally i think that in this step you need to mark the xml element that should be removed, then in a postprocess step find those marked elements and remove them with string replace, just like we do in other cases. check some of the postprocess modules, like postprocess/sections.js, to get an idea of how to search for an element and replace it later using a regexp.
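
A rough sketch of what such a postprocess step could look like on the rendered XML string follows. It is not taken from postprocess/sections.js; `removeMarkedElements` is a hypothetical name, the marker format is the `$docxRm<base64>$` one from the helper draft, and the sketch assumes that a true value means "remove", that the target element is not nested inside another element of the same name, and that it has no self-closing siblings with attributes.

```javascript
// Hypothetical postprocess sketch: remove the element that encloses each docxRm marker.
function removeMarkedElements (xml) {
  const markerRe = /\$docxRm([A-Za-z0-9+/=]+)\$/
  let match
  while ((match = markerRe.exec(xml)) !== null) {
    const { tag, value } = JSON.parse(Buffer.from(match[1], 'base64').toString('utf8'))
    const markerStart = match.index
    const markerEnd = markerStart + match[0].length
    const stripMarkerOnly = () => { xml = xml.slice(0, markerStart) + xml.slice(markerEnd) }

    if (!value) { stripMarkerOnly(); continue }

    // '[ >]' keeps '<w:p>' and '<w:p ...>' but skips '<w:pPr>' and '<w:p/>'
    const openRe = new RegExp('<w:' + tag + '[ >]', 'g')
    let openIdx = -1
    let open
    while ((open = openRe.exec(xml)) !== null && open.index < markerStart) {
      openIdx = open.index // remember the last opening tag before the marker
    }
    const closeTag = '</w:' + tag + '>'
    const closeIdx = xml.indexOf(closeTag, markerEnd)

    if (openIdx === -1 || closeIdx === -1) { stripMarkerOnly(); continue }
    xml = xml.slice(0, openIdx) + xml.slice(closeIdx + closeTag.length)
  }
  return xml
}

// Usage: the second paragraph carries a marker asking for removal of its w:p
const mark = (hash) => '$docxRm' + Buffer.from(JSON.stringify(hash)).toString('base64') + '$'
const input = '<w:body><w:p><w:r><w:t>keep</w:t></w:r></w:p>' +
  '<w:p><w:pPr/><w:r><w:t>' + mark({ tag: 'p', value: true }) + '</w:t></w:r></w:p></w:body>'
const output = removeMarkedElements(input)
console.log(output)
```

When value is false, or no enclosing element is found, only the marker text is stripped so that it never leaks into the final document.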

> 2. import and call it in preprocess.js. Assuming it would need to be called first (before bookmark(...)):

the exact place will depend on other things. usually the transformations (concatTags, bookmark) are implemented with a specific order in mind. removing elements is kind of a big thing, because you can break other features by removing elements that are expected to be there when the transformations execute, so i would say you need to call this remove transform at the very end.

> 3. add a docxRm function to packages/jsreport-docx/static/helpers.js, something like:

yes, this sounds good.
