Linguist overrides in .gitattributes #452

Open
egthomas opened this issue Aug 18, 2021 · 1 comment

egthomas commented Aug 18, 2021

GitHub uses a library called Linguist to detect the language / file type of each file in a repository. The results drive the language statistics shown on the main page, syntax highlighting, and the organization of search results. In this repository for the RST, a significant number of files are misclassified, e.g.,

  • IDL .pro files as Prolog (e.g., aacgm_v2.pro)
  • C .h files as C++
  • hdw.dat.* files as a variety of unrelated languages, depending on the trailing extension (e.g., hdw.dat.sas)
  • AACGM .asc coefficient files as AGS script

The .gitattributes file I've tried using to override this behavior looks something like this:

*.pro linguist-language=IDL
*.h linguist-language=C

hdw.dat.* -linguist-detectable
*.asc -linguist-detectable

but I'm not sure if this is correct, and it is not clear to me under what circumstances the language statistics / syntax highlighting will actually be updated (e.g., does the file need to be modified by a new commit, or does a new version of Linguist need to be rolled out on GitHub?).
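
One way to check whether the patterns match the intended files, independent of Linguist itself, is `git check-attr`, which reports the attributes Git resolves for a given path. The sketch below runs it in a throwaway repository containing the rules above; the scratch directory is only for illustration, and the same `git check-attr` calls should work from the root of the RST working tree. Note that this only confirms that Git resolves the attributes as intended, not that Linguist or GitHub will honor them.

```shell
# Scratch repository so the check does not touch a real working tree.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# The override rules proposed above.
cat > .gitattributes <<'EOF'
*.pro linguist-language=IDL
*.h linguist-language=C

hdw.dat.* -linguist-detectable
*.asc -linguist-detectable
EOF

# git check-attr prints "<path>: <attribute>: <value>"; the paths do not
# need to exist for the patterns to be evaluated.
lang_attr=$(git check-attr linguist-language -- aacgm_v2.pro)
detect_attr=$(git check-attr linguist-detectable -- hdw.dat.sas)

echo "$lang_attr"    # aacgm_v2.pro: linguist-language: IDL
echo "$detect_attr"  # hdw.dat.sas: linguist-detectable: unset
```

A value of `unspecified` for either attribute would mean the pattern did not match that path.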

Is this something worth pursuing? I personally find it helpful to have the correct syntax highlighting available when browsing code on GitHub, and it can also be difficult to use the search feature when the file you're looking for appears under the wrong language.

ecbland commented Aug 19, 2021

@egthomas Completely agree that the misclassified files are annoying!

I just installed github-linguist locally and put your example .gitattributes file in the root directory of RST:

$ github-linguist 
80.52%  4807712    C
13.80%  823872     IDL
3.39%   202673     Makefile
1.22%   72673      Shell
0.98%   58274      Prolog
0.05%   2974       CSS
0.02%   1270       JavaScript
0.01%   627        Scheme
0.01%   621        SAS
0.01%   478        Scilab

So it doesn't seem to detect the .gitattributes file. I then renamed it to .git/info/attributes:

$ github-linguist 
80.52%  4807712    C
14.79%  883257     IDL
3.39%   202673     Makefile
1.22%   72673      Shell
0.05%   2974       CSS
0.02%   1270       JavaScript

This looks promising! I don't know why the .gitattributes file doesn't work, since the documentation suggests otherwise. It might also behave differently on GitHub compared to a local install.
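
A likely explanation is that Linguist reads overrides from the committed tree: its overrides documentation notes that, when testing with a local install, added attributes do not take effect until the .gitattributes file is committed. A minimal sketch for testing this in a scratch repository follows. The file name hello.pro, its contents, and the commit identity are made up for illustration, and the final step runs only if github-linguist is installed.

```shell
# Scratch repository so the experiment does not touch the RST tree.
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Hypothetical IDL source file plus a one-line override.
printf "pro hello\n  print, 'hello'\nend\n" > hello.pro
echo '*.pro linguist-language=IDL' > .gitattributes

# Commit both files; the assumption being tested is that Linguist
# only sees .gitattributes once it is part of a commit.
git add hello.pro .gitattributes
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Add Linguist overrides"

# Per-language file breakdown, if a local Linguist install is available.
if command -v github-linguist >/dev/null 2>&1; then
    github-linguist --breakdown
fi
```

If hello.pro is listed under IDL here but under Prolog when .gitattributes is left uncommitted, that would account for the first result above.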