New lesson: Del caos hacia el orden, gestionar fuentes primarias digitalizadas con Tropy (From Chaos to Order: Managing Digitized Primary Sources with Tropy) #598

Open
jenniferisasi opened this issue Jan 23, 2024 · 13 comments

Comments

@jenniferisasi
Contributor

Lesson to be received and published in FR, ES and PT simultaneously

Programming Historian en español has received the proposal text for "Del caos hacia el orden, gestionar fuentes primarias digitalizadas con Tropy" from Douglas McRae @mcraed2004.

Learning objectives:

  • Effectively organize and annotate image files of primary sources as research data
  • Learn about the importance and correct use of metadata suited to your primary sources and your research
  • Manage research data to extend your future analysis and presentation

The editor commits to completing a first reading of the lesson within two weeks. The author will then have a further two weeks (date negotiable) to make any necessary changes.

For now, the main contact for this lesson is @jenniferisasi (ES).

If any problem arises, the author can contact our 'ombudsperson' (Silvia Gutiérrez de la Torre - http://programminghistorian.org/es/equipo-de-proyecto). Contacting the ombudsperson does not affect the publication process for the lesson.

Anti-Harassment Policy

This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, make suggestions, or to request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. Thank you for helping us to create a safe space.

@charlottejmc
Collaborator

Hello @jenniferisasi and @mcraed2004,

You can find the files here:

You can preview the lesson here:

I also noticed a couple of small things when processing the files, listed below:

  • At Line 193, there seems to be placeholder text: [Imagen: Opciones de imprimir en Preferencias]. I wonder if you meant to add another image here?
  • I noticed that the dataset used in this lesson is the 'colección Sección Civiles-Esclavos'. We'd like to be able to host all our lessons' assets in a dedicated folder on our GitHub repository. As you can see from the links above, I have created assets/gestionar-fuentes-primarias-digitales-con-tropy. This folder is currently empty, because I actually cannot access the dataset myself without creating an account to log into the Red Historia Venezuela website. Can I leave it up to you to upload the assets into the folder?

Thank you very much! ✨

@mcraed2004
Collaborator

At your service!

  1. Yes, I forgot to include an image here. I will place it in the images repository and temporarily call it del-caos-hacia-el-orden6b. The justification for this image is that it helps the user locate the settings that control what is displayed when they print or export.

  2. Yes, the Red Historia Venezuela now requires that users create a login. Previously, this intermediary step did not exist. I mention in the tutorial that to access it, the user will need to create a login. My question is this: what types of assets need to be uploaded? Like in the other tutorials, the repository consists of hundreds of high-quality PDF scans, along with metadata and supplementary material on the site itself. Do we just need to upload a few examples? Please let me know the minimum that should be uploaded to the assets folder.

@mcraed2004
Collaborator

es-or-gestionar-fuentes-primarias-digitales-con-tropy-06b I don't have push access to the images folder, so I attach the missing image here. I changed the title to reflect the formatting of the other images.

@anisa-hawes
Contributor

anisa-hawes commented Jan 31, 2024

Hello @mcraed2004,

I've double-checked and you do appear to have the necessary access as an Outside Collaborator, but I apologise if this isn't working from your side when you tried to upload to the /images directory 🤔 ... Don't worry, @charlottejmc and I can take care of adding in this missing image on your behalf. Could you email it to Charlotte directly (publishing.assistant[@]programminghistorian.org) or add this to Sofia's repo with the other materials so we can download it from there?

In answer to your question about the assets. Is the colección Sección Civiles-Esclavos open access? We can liaise with @jenniferisasi on this, but I'd suggest that there are a couple of considerations:

  • If the lesson can be worked through in a meaningful way using a small sample of files, I think it would be practical to make a selection from the dataset. We can save these into a .zip file and host them in our /assets directory for readers to download. We can also link to the larger dataset for readers who want to experiment further.

  • If it is necessary for the lesson that readers handle a much larger dataset, we can either host it as a .zip on Zenodo (where we have capacity to host larger files than is possible on GitHub) or we can provide a link for readers to download the dataset from the original source.

Let us know what you think.

Very best,
Anisa

@jenniferisasi
Contributor Author

Thank you @anisa-hawes, I was waiting for your recommendations as much as the author.

Ideally, the best would be if we can get a few sample files in our repository for people to download and use in Tropy if they want to practice. But it's not bad at all to have something bigger in Zenodo either.

@anisa-hawes
Contributor

Super. Perhaps we could do both.

@mcraed2004 are you able to advise on the scale (number of sample documents) that would be needed for readers to work through the lesson?

Do you like @jenniferisasi's suggestion of both selecting a small sample set of files for practice, and providing a larger sample set of files for further experimentation?

@spapastamkou
Contributor

(I gave the author write access to ph-submissions after he posted the comment and before Anisa double-checked, if this helps :)).

@jenniferisasi
Contributor Author

jenniferisasi commented Feb 3, 2024

Excellent work, @mcraed2004 (and team)! Detailed and thorough information on how to use Tropy.

I would say that it needs a bit more on the use case, to help readers see how using Tropy can help them pursue their research questions. What is the value (besides organizing) of having the collection of the ANHV in Tropy? Can metadata show some patterns? Brainstorming here. I am also going to chat with the rest of the editors and @anisa-hawes about the last part of the lesson, re Zotero, to see what they think about having that part in the lesson.

Besides that, here are a few suggestions from my first pass through the text, nothing too big:

  • 16. "encontrabilidad [discoverability] de las fuentes" -- I suggest "facilitar el descubrir o localizar las fuentes"
  • 16. "imágenes de las siguientes formatos:" -- I suggest "imágenes en los siguientes formatos:"
  • 17. I suggest adding the Wikipedia link for "la resolución en píxeles (ppi)": https://es.wikipedia.org/wiki/Píxeles_por_pulgada
  • 18 and 19. I think it would be worth noting here that readers need to check the size/format/quality of the image they drag into Tropy, depending on how the photo will be used. For example, the LoC image (a map), when dragged from the window, is copied at very low resolution as a JPG, but a higher resolution can be downloaded.
  • 22. The OJO note could go in a warning box.
  • 26. Two things. (1) "Además, Tropy viene con una plantilla utilizando todas las quince propiedades de metadatos de Dublin Core" -- I suggest changing this to "Además, Tropy viene con una plantilla con los quince elementos de metadatos de Dublin Core". (2) I suggest changing the Dublin Core link to its Spanish-language Wikipedia article.
  • 33. I think you meant to write "(1 feb 1730 se convertirá en 1730-02-01)" rather than "(1730-02-01 se convertirá en 1 feb 1730)", to reflect the change to an ISO date. I would add a link to the Spanish Wikipedia for the acronym ISO.
  • 36. Replace "buscabilidad" with "facilidad de búsqueda"
  • 38. "lementos y términos de Dublin Core, vocabularios RDF, y European Data Model y vocabularios relacionados": add links to information in Spanish, except for DC, which is already linked above.
  • 41. personalizada
  • 42. The parenthesis is unnecessary and a comma is missing before "por ejemplo"
  • 44. This more or less repeats paragraph 38. Can it be edited or deleted? Could the notion of controlled vocabulary be placed in another section, perhaps with a link to more information?
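
To make the direction of the conversion in item 33 concrete, here is a minimal Python sketch of turning a date typed as "1 feb 1730" into the ISO form "1730-02-01". It only illustrates the target format and does not describe how Tropy itself parses dates; the function name `to_iso_date` and the table of Spanish month abbreviations are assumptions made for this example.

```python
import datetime

# Spanish month abbreviations (an assumption about how dates are typed; adjust as needed).
SPANISH_MONTHS = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sep": 9, "oct": 10, "nov": 11, "dic": 12,
}


def to_iso_date(text: str) -> str:
    """Convert a 'day month year' date such as '1 feb 1730' to ISO 8601 (YYYY-MM-DD)."""
    # Normalise case and drop any trailing dot on the month abbreviation.
    day, month, year = text.lower().replace(".", "").split()
    # Only the first three letters of the month are used for the lookup.
    return datetime.date(int(year), SPANISH_MONTHS[month[:3]], int(day)).isoformat()


print(to_iso_date("1 feb 1730"))  # 1730-02-01
```
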

More soon @mcraed2004! Thank you for your work.

@mcraed2004
Collaborator

@jenniferisasi thank you for these suggestions. I've only now had the opportunity to review them, and most look easily resolvable. @anisa-hawes @charlottejmc What is the procedure to resubmit with revisions? Should I edit the file at /es/borradores/originales/gestionar-fuentes-primarias-digitales-con-tropy.md or resubmit some other way? This is my first time revising a publication via GitHub, and some of it is not intuitive for me.

@mcraed2004
Collaborator

mcraed2004 commented Feb 23, 2024

@jenniferisasi @charlottejmc @anisa-hawes Also, in terms of assets. For the purpose of the tutorial, I think it would be sufficient to upload one of the scanned volumes (tomos) to the assets repository. The issue is these are quite large (the one I use in the tutorial is 340 MB). By using this asset however, users will be able to practice exploding and merging objects in Tropy as well as applying and completing metadata templates. I can upload that file to the assets folder if you would like (as well as make more clear mention of it in the tutorial). EDIT: Looking at the question of hosting more assets via Zenodo--I don't think this is necessary, as users who want to explore further can access the larger dataset themselves. Creating a login is free, and that's where the metadata is hosted as well. To work most effectively with different tomes, users would need access to those metadata.

In terms of @jenniferisasi's comments regarding the use case, I can attempt to elaborate on some of the advantages in an additional paragraph, perhaps in the Dataset section. In addition to helping divide these annual volumes into individual cases (expedientes), templates can describe metadata, including author and location, that allow users to organize their Tropy project in ways that would be more cumbersome if they worked from the website or from PDFs stored on their drive. I've also noticed a change since I started working with these files, or at least one I didn't notice before. The project managers have added a list of "Temas" in the metadata to help with searchability on their database. These terms are based on an index written by another scholar and published in physical format some decades ago. In any case, Tropy would allow users to incorporate these terms into their projects via tags (etiquetas), and indeed would allow them to do so at a more specific level, as the "temas" terms are applied at the level of the tomo, not the individual expediente.

Let me know as well what you decide on the Zotero section. In Tropy workshops, I was often asked to describe the relationship between these two tools, and I thought this might be an opportunity not just to describe steps but also to discuss the pros and cons of importing metadata from Tropy into Zotero references.

I look forward to completing these revisions soon.

@anisa-hawes
Contributor

Hello Douglas @mcraed2004,

Yes, you can make edits to the lesson file here: /es/borradores/originales/gestionar-fuentes-primarias-digitales-con-tropy.md. I've double-checked that you have the Write access you need. We don't use the Git Pull Request system in our Submissions Repository, so you can make direct commits to your file. If you experience any difficulties, I'm happy to help.

In terms of the assets, we could create a .zip file containing one of the tomos and save this in our /assets directory for readers to download? Then, as you suggest, we could include a link to a larger dataset at the original source rather than replicate it to host ourselves.
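
As a rough illustration of the .zip step, here is a minimal Python sketch that bundles a folder holding one tomo into a single archive. The function name `zip_sample` and the paths in the usage note are hypothetical, and the archive should be written outside the source folder so that it does not end up inside itself.

```python
import zipfile
from pathlib import Path


def zip_sample(source_dir: str, archive_path: str) -> list[str]:
    """Zip every file under source_dir and return the names stored in the archive."""
    source = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to the source folder, with forward slashes.
                archive.write(path, arcname=path.relative_to(source).as_posix())
        return archive.namelist()
```

For example, `zip_sample("tomo-sample", "tomo-sample.zip")` (both names are placeholders) would produce an archive that could then be saved into the /assets directory.
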

Thank you for all the work you've done already to think through Jenn's feedback ✨

mcraed2004 added a commit that referenced this issue Mar 8, 2024
Resolved comments/edits from checkbox list in #598 (comment)
@jenniferisasi
Contributor Author

Thank you @mcraed2004! As I am re-reading the lesson to see the changes made, I am wondering, @anisa-hawes, shall we proceed with reviews already, or should we wait for the FR and PT lessons to be in the pipeline as well? I see both advantages and disadvantages to either option. What do you think?

@anisa-hawes
Contributor

anisa-hawes commented Mar 13, 2024

Thank you for your work on the revisions, Douglas @mcraed2004

You will note that I have removed this {% syntax which you used to surround notes-to-yourself at lines 82 and 102. Unfortunately, this interferes with the liquid syntax which we use to reference and caption figure images, causing an error in our site build. I realise these notes are temporary, but I've taken the step to instead indent them with a > at the start of the line so that the build runs smoothly for other users.
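
For anyone who wants to check a draft for this kind of problem before committing, here is a minimal Python sketch that flags lines where `{%` is opened but not closed with `%}`. It assumes that legitimate Liquid tags open and close on the same line, which may not hold for every file, and the function name `find_unclosed_liquid_tags` is made up for this example; it is not part of our build tooling.

```python
def find_unclosed_liquid_tags(markdown: str) -> list[int]:
    """Return 1-based line numbers where '{%' appears more often than '%}'."""
    return [
        number
        for number, line in enumerate(markdown.splitlines(), start=1)
        # Assumption: a well-formed tag opens and closes on the same line.
        if line.count("{%") > line.count("%}")
    ]


sample = "\n".join([
    "Intro paragraph",
    "{% nota: revisar imagen",                       # stray opener, never closed
    '{% include figure.html filename="a.png" %}',    # balanced tag
    "> nota: revisar imagen",                        # blockquote note, no Liquid
])
print(find_unclosed_liquid_tags(sample))  # [2]
```
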

--

Thank you for this question, @jenniferisasi. I have the sense that it may be preferable to wait until the FR and PT lessons in this trio are at the same Phase of the workflow -- this would also allow the reviews to be collated and discussed within the editorial 'seminars'. But I'd be keen to hear more of your thoughts about the benefits of each option. Shall we chat about this when we meet? [I've just replied to your other message on Slack ☺️ ]

Projects
Status: 2 Initial Edit

5 participants