#1 2015-08-31 07:57:11

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

[Solved] how does geany recognize syntax, not based on filename? mime?

yes, it can do that.
with shell scripts for example. or python.
even though they don't end in .sh or .py.
most probably it comes from that first bit at the beginning of the file:

#!/bin/sh
or
#!/bin/bash
or
#!/bin/env python

and here's the problem:
i'd like a similar functionality for my config files.
I'm soooooo tired of doing this every time:
[screenshot: geany-filetypes-conf.png]
...so i'd like to add a similar bit to the top of my config files.

however, i have no clue how to do that.

i have looked at these files:
/usr/share/geany/filetypes.sh
/usr/share/geany/filetypes.conf

- but the only difference i can see is that filetypes.sh has a bit about mimetypes, which filetypes.conf hasn't.

so i thought, ok, could be mimetypes (but i'm really not sure if i'm on the right track) - and had a look at
/usr/share/mime/application/x-shellscript.xml
but i don't see how the magic happens.
(and no, there's nothing with "conf" under /usr/share/mime)

any ideas/pointers?

i have also tried to randomly put funny-looking bits at the beginning of my extensionless config files, like e.g.
#conf
or
#!<conf>

Last edited by ohnonot (2015-09-02 18:50:24)

Offline


#2 2015-08-31 08:43:39

damo
#! gimpbanger
From: N51.5 W002.8 (mostly)
Registered: 2011-11-24
Posts: 5,434

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

Have you looked at Geany Custom filetypes?

An important thing to remember is to add "lexer_filetype=..." in the appropriate place.


BunsenLabs Group on deviantArt
damo's gallery on deviantArt
Openbox themes
Forum Moderator smile

Offline

#3 2015-08-31 12:00:39

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot wrote:

...
and here's the problem:
i'd like a similar functionality for my config files.
...

Could you please provide the names of some of your config files, the ones without extensions?

For me, files like gtkrc, tint2rc, leafpadrc, .conkyrc, all are seen by Geany 1.23.1 as filetype "config". (As are those with .conf or .ini extensions.)

Last edited by flaneur (2015-08-31 12:04:46)

Offline

#4 2015-08-31 20:57:55

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

@damo:
i thought the lexer is for actually reading the code and creating the syntax highlighting.
i'm not sure if that is what i want - i do not want to change the way "config" files are understood.
i just want geany to recognize them regardless of their filename, like it does with e.g. shell or python scripts.
--------------------------------------------------------------------------------
@flaneur:
thanks for asking. i was wrong to say "extension". i meant files that don't contain "conf", "config", "cfg" or "rc".
e.g., a config file called "default".
--------------------------------------------------------------------------------


one could say, i want to understand how geany understands that a shell script is a shell script, even though it doesn't end in .sh.
btw, this also works for python scripts.
i guess it's the hashbang (or crunchbang big_smile) at the beginning of the file.
but how does it get interpreted? by the lexer?

ideally, i'd like to have a custom hashbang for config files.

ps:
i can see the same magic happening in my filemanager, which leads me to believe it has something to do with mimetypes.
there doesn't seem to be a config file mime type, but i read that it's possible to create custom mimetypes.

Offline

#5 2015-09-01 00:43:23

tknomanzr
#! Die Hard
From: Heavener, OK
Registered: 2014-12-09
Posts: 777

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

Linux doesn't really care about file extensions. It looks at the file's contents to determine its type. One way to go about it that doesn't mess with mime-types is to put your skeleton into ~/Templates, open the template from there, then save it wherever you need it.

Offline

#6 2015-09-01 02:41:58

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot wrote:

i do not want to change the way "config" files are understood.
i just want geany to recognize them regardless of their filename, like it does with e.g. shell or python scripts.

I was trying to get geany to recognize conky config files the other day, and also wondering how to get it to recognize shell code in a file without a hashbang... that's the extent of my qualification, but this is how it seems to me:

  • If there's a hashbang at the top, that gets priority. However, you can't add new "custom" hashbangs, you're stuck with the default list.

  • Filetypes are also recognized from their names, and custom filetypes are possible. eg you can create a new filetype "Conky file" and have any filename that matches *conky* recognized.

  • Mime-types are ignored.

  • To highlight a new filetype it's easiest to borrow the template from something that exists already and edit to taste.

People who know better, please correct this!

also:

damo wrote:

Have you looked at Geany Custom filetypes?
An important thing to remember is to add "lexer_filetype=..." in the appropriate place.

(I forgot the "lexer_filetype=..." sad )


John
--------------------
( a boring Japan blog , Japan Links, idle twitterings  and GitStuff )
#! forum moderator    BunsenLabs

Offline

#7 2015-09-01 05:37:09

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

I came across this link: https://www.ibm.com/developerworks/comm … ux?lang=en

I read the man page and found that there are three sets of tests, performed in this order: file system tests, magic tests, and language tests.
...
The file-system tests are based on examining the return from a stat(2)
system call. The program checks to see if the file is empty, or if it's
some sort of special file. Any known file types appropriate to the system
you are running on (sockets, symbolic links, or named pipes (FIFOs) on
those systems that implement them) are intuited if they are defined in the
system header file <sys/stat.h>.
...
The magic tests are used to check for files with data in particular fixed
formats. These files have a “magic number” stored in a particular place
near the beginning of the file that tells the UNIX operating system that
the file is a binary executable, and which of several types thereof. The
concept of a “magic” has been applied by extension to data files.
...
If a file does not match any of the entries in the magic file, it is
examined to see if it seems to be a text file.
...
Once file has determined the character set used in a text-type file, it
will attempt to determine in what language the file is written. The
language tests look for particular strings (cf. <names.h>) that can appear anywhere
in the first few blocks of a file. For example, the keyword .br indicates
that the file is most likely a troff(1) input file, just as the keyword
struct indicates a C program. These tests are less reliable than the
previous two groups, so they are performed last.

I don't know whether the above is helpful but can someone please explain what is meant by "The language tests look for particular strings (cf. <names.h>) that can appear anywhere in the first few blocks of a file" in the last part of the quote?
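
As a rough illustration of the difference between the "magic" and "language" tests quoted above, here is a minimal sketch in C. It is not how file(1) is actually implemented; the enum names and the function guess_kind are made up for this example, and only the ELF signature and the "struct" keyword from the quote are checked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for this sketch; file(1) knows far more types. */
enum kind { KIND_UNKNOWN, KIND_ELF, KIND_SCRIPT, KIND_MAYBE_C };

/* Guess what a buffer holds, trying the reliable tests first. */
static enum kind guess_kind(const char *buf, size_t len)
{
	size_t i;

	/* "magic" test: fixed bytes at a fixed place (here: the very start) */
	if (len >= 4 && memcmp(buf, "\177ELF", 4) == 0)
		return KIND_ELF;
	if (len >= 2 && buf[0] == '#' && buf[1] == '!')
		return KIND_SCRIPT;

	/* "language" test: a telltale keyword anywhere in the buffer */
	for (i = 0; i + 6 <= len; i++)
		if (memcmp(buf + i, "struct", 6) == 0)
			return KIND_MAYBE_C;

	return KIND_UNKNOWN;
}
```

The keyword search is the weaker test, since "struct" can just as well appear in ordinary prose, which is why the man page says those tests run last.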

Last edited by flaneur (2015-09-01 05:53:26)

Offline

#8 2015-09-01 05:56:14

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

@flaneur - that's about the 'file' command though. I'm not sure if geany uses file when determining filetypes.



Offline

#9 2015-09-01 06:09:27

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

johnraff wrote:

@flaneur - that's about the 'file' command though. I'm not sure if geany uses file when determining filetypes.

Even I don't know that but if I double-click on .conkyrc (or the other config files I mentioned earlier), Geany opens it with syntax highlighting and Document > Set Filetype > Miscellaneous > Config file is what I see.

I'm pretty sure I haven't done anything to train Geany.

(Confession: I'm using the Openbox session of Lubuntu 14.04 and installed Geany 1.23.1 from the stock Ubuntu repo.)

My guess is that if Geany opens a plain text file, it scans the "initial" part of the file to somehow get information about whether the file is actually plain text or a config file and then automatically uses syntax highlighting if it feels it is appropriate.

Offline

#10 2015-09-01 07:42:55

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

@tknomanzr: huh?
in case my first post was confusing, i rephrased it a little.

johnraff wrote:

However, you can't add new "custom" hashbangs, you're stuck with the default list.

do you know what/where that default list is?

Filetypes are also recognized from their names, and custom filetypes are possible. eg you can create a new filetype "Conky file" and have any filename that matches *conky* recognized.

this is understood, but it's not what i'm asking (@flaneur!).

Mime-types are ignored.

are you sure?
see here:

ohnonot wrote:

i have looked at these files:
/usr/share/geany/filetypes.sh
/usr/share/geany/filetypes.conf
- but the only difference i can see is that filetypes.sh has a bit about mimetypes, which filetypes.conf hasn't.

-

To highlight a new filetype it's easiest to borrow the template from something that exists already and edit to taste.

i guess you mean a syntax highlighting filetype template?
again, this is not what this thread is about.
i'm happy with how config files get highlighted.
...unless there's something (the lexer?) that reads the first bit and makes highlighting decision based on that.

i'm off to work now, but i guess i'll be looking into mimetypes next.
iirc, it is possible to create new mimetypes.
just not sure what's happening, the files in /usr/share/mime don't seem to contain much.

Offline

#11 2015-09-01 08:20:41

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot wrote:
johnraff wrote:

However, you can't add new "custom" hashbangs, you're stuck with the default list.

do you know what/where that default list is?

Sorry, I've no idea. Not even sure if that statement is true, it's just the impression I've got so far. Ready - eager - to be corrected...

Mime-types are ignored.

are you sure?

Again, not sure, but it's consistent with my experience so far.

see here:

ohnonot wrote:

i have looked at these files:
/usr/share/geany/filetypes.sh
/usr/share/geany/filetypes.conf
- but the only difference i can see is that filetypes.sh has a bit about mimetypes, which filetypes.conf hasn't.

I couldn't find any reference to mimetypes in filetypes.sh (but quite a lot of differences between the two files). I think those filetypes.* files are about how various filetypes should be treated, not how they're detected.

i'm happy with how config files get highlighted.
...unless there's something (the lexer?) that reads the first bit and makes highlighting decision based on that.

Well, the hashbang is definitely being read, and even a file called something.conf will be treated as a shell script if it starts with `#!/bin/bash'.

flaneur wrote:

For me, files like gtkrc, tint2rc, leafpadrc, .conkyrc, all are seen by Geany 1.23.1 as filetype "config". (As are those with .conf or .ini extensions.)

This is defined in /usr/share/geany/filetype_extensions.conf. I don't know of any other way of recognizing config files, though eager to learn...

Last edited by johnraff (2015-09-01 08:21:49)



Offline

#12 2015-09-01 12:42:30

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

johnraff wrote:

...

flaneur wrote:

For me, files like gtkrc, tint2rc, leafpadrc, .conkyrc, all are seen by Geany 1.23.1 as filetype "config". (As are those with .conf or .ini extensions.)

This is defined in /usr/share/geany/filetype_extensions.conf. ...

Thanks for that!

I copied over /usr/share/geany/filetype_extensions.conf to ~/.config/geany. Then I edited the copied over file to change

Conf=*.conf;*.ini;config;*rc;*.cfg;*.desktop;control;

to

Conf=*.conf;*.ini;config;*rc;*.cfg;*.desktop;control;default;

Then I made a copy of .conkyrc called default. Opening either file in Geany shows the same syntax highlighting and Geany sees both as "Config file": http://i.imgur.com/hkYotoy.png
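
To picture why adding "default;" to that list is enough, here is a minimal sketch of matching a file name against a semicolon-separated pattern list. It uses POSIX fnmatch(3) as a stand-in; Geany does its own matching, and the function name matches_pattern_list is an assumption made up for this example.

```c
#include <fnmatch.h>
#include <string.h>

/* Return 1 if `name` matches any pattern in a semicolon-separated list
 * such as "*.conf;*.ini;config;*rc;default;", else 0. */
static int matches_pattern_list(const char *patterns, const char *name)
{
	char buf[256];
	const char *p = patterns;

	while (*p != '\0') {
		size_t n = strcspn(p, ";");      /* length of the next pattern */
		if (n > 0 && n < sizeof buf) {
			memcpy(buf, p, n);
			buf[n] = '\0';
			if (fnmatch(buf, name, 0) == 0)
				return 1;
		}
		p += n;
		if (*p == ';')                   /* step over the separator */
			p++;
	}
	return 0;
}
```

With the edited Conf= line, a file called default matches the literal pattern "default", while .conkyrc and tint2rc are caught by "*rc".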

Offline

#13 2015-09-01 20:54:24

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

got it.

at first i searched the web for "create mimetype" but didn't find anything that goes beyond filename extension globbing.

then i did some tests:
will my file manager and geany behave the same if i remove the hashbang in the first line (=not recognize the filetype anymore)?
turned out, they didn't.
so fair bet, there must be some geany-internal mechanism at work.
and i found it here:

The number of lines to search for the filetype with the extract filetype regex.   |   2

meaning: geany searches the first 2 lines of every file for some hashbang to tell it what filetype it is.
so chances are good i will find what i need in filetypes.c.

there's a section that basically says "if the first 2 characters of the line are '#!', then determine the filetype from the next word" (source code for geany 1.25):

if (strlen(line) > 2 && line[0] == '#' && line[1] == '!')
	{
		static const struct {
			const gchar *name;
			filetype_id filetype;
		} intepreter_map[] = {
			{ "conf",	GEANY_FILETYPES_CONF },
			{ "sh",		GEANY_FILETYPES_SH },
			{ "bash",	GEANY_FILETYPES_SH },
			{ "dash",	GEANY_FILETYPES_SH },
			{ "perl",	GEANY_FILETYPES_PERL },
			{ "python",	GEANY_FILETYPES_PYTHON },
			{ "php",	GEANY_FILETYPES_PHP },
			{ "ruby",	GEANY_FILETYPES_RUBY },
			{ "tcl",	GEANY_FILETYPES_TCL },
			{ "make",	GEANY_FILETYPES_MAKE },
			{ "zsh",	GEANY_FILETYPES_SH },
			{ "ksh",	GEANY_FILETYPES_SH },
			{ "mksh",	GEANY_FILETYPES_SH },
			{ "csh",	GEANY_FILETYPES_SH },
			{ "tcsh",	GEANY_FILETYPES_SH },
			{ "ash",	GEANY_FILETYPES_SH },
			{ "dmd",	GEANY_FILETYPES_D },
			{ "wish",	GEANY_FILETYPES_TCL },
			{ "node",	GEANY_FILETYPES_JS },
			{ "rust",	GEANY_FILETYPES_RUST }
		};

it starts at line 608.
here i already added my extra "conf" line.

now i have a custom geany build.
every file that starts with

#!conf

is interpreted as filetype conf.
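
For readers who want to see the whole idea in one piece, here is a self-contained sketch of a hashbang lookup along the lines of the excerpt above. It is a simplified illustration, not Geany's code: the enum ft values and the function filetype_from_shebang are hypothetical stand-ins, only a few interpreters are listed, and names are matched exactly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ids standing in for Geany's GEANY_FILETYPES_* values. */
enum ft { FT_NONE, FT_CONF, FT_SH, FT_PYTHON };

static enum ft filetype_from_shebang(const char *line)
{
	static const struct { const char *name; enum ft type; } map[] = {
		{ "conf",   FT_CONF },   /* the custom entry described above */
		{ "sh",     FT_SH },
		{ "bash",   FT_SH },
		{ "python", FT_PYTHON },
	};
	const char *p, *base;
	size_t n, len, i;
	int pass;

	if (strlen(line) <= 2 || line[0] != '#' || line[1] != '!')
		return FT_NONE;

	p = line + 2;
	base = p;
	len = 0;
	for (pass = 0; pass < 2; pass++) {
		p += strspn(p, " \t");           /* skip blanks before the word */
		n = strcspn(p, " \t\r\n");       /* the word ends at whitespace */
		base = p;
		for (i = 0; i < n; i++)          /* keep what follows the last '/' */
			if (p[i] == '/')
				base = p + i + 1;
		len = n - (size_t)(base - p);
		if (!(len == 3 && strncmp(base, "env", 3) == 0))
			break;                       /* not "env": this is the interpreter */
		p += n;                          /* "env": use the next word instead */
	}

	for (i = 0; i < sizeof map / sizeof map[0]; i++)
		if (strlen(map[i].name) == len && strncmp(base, map[i].name, len) == 0)
			return map[i].type;
	return FT_NONE;
}
```

In this sketch "#!conf", "#!/bin/bash" and "#!/bin/env python" all resolve through the same table, so supporting another interpreter only means adding one more row.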

here is a PKGBUILD for archlinux.

all that said, flaneur's is still a valuable addition.
and if someone has an idea how to achieve the same thing without recompiling, well, i'm listening.

Last edited by ohnonot (2015-09-01 20:58:13)

Offline

#14 2015-09-02 01:33:38

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

Great work ohnonot! At least now we know it's compiled-in.

I'll just throw in one more thing I remembered - it's possible to over-ride geany's auto-detection of encoding (which usually works, but not always) with a commented-out header like 'coding SJIS' . "These specifications must be in the first 512 bytes of the file.". So apart from the #! hashbang, a certain amount of other parsing is going on.

I don't know if there'd be any possibility of a compiled-in custom header like 'filetype:Conf'? That would be really handy because you'd be able to set any filetype...



Offline

#15 2015-09-02 18:49:13

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

johnraff wrote:

I don't know if there'd be any possibility of a compiled-in custom header like 'filetype:Conf'? That would be really handy because you'd be able to set any filetype...

you mean already compiled in, or possible to patch?
if the latter, take a look at filetypes.c, starting from the aforementioned line 608, there's some more parsing happening. maybe you can figure it out.

Offline

#16 2015-09-03 02:20:20

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot wrote:

you mean already compiled in, or possible to patch?

Sorry, but I don't understand the difference. With open-source packages, surely anything is possible to patch?
EDIT: ah I think I understand - you mean options that are already set up in the source code to be easily changed?

Anyway, thanks for the hint - I'll have a look at filetypes.c, though it's unlikely I'll be able to do much...

encodings.c is where the 'encoding:sjis' type headers are read, and in theory it looks feasible to do the same sort of thing with 'filetype:Conf' etc and put it in filetypes.c, but I'll have to learn C first.

Last edited by johnraff (2015-09-03 02:42:25)



Offline

#17 2015-09-03 02:30:38

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

Good work! I learned a lot from this thread. Maybe you could suggest that Geany make it easier to use custom "headers". The Geany mailing lists are pretty active: http://www.geany.org/Support/MailingList.

Offline

#18 2015-09-04 07:26:49

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

flaneur wrote:

The Geany mailing lists are pretty active: http://www.geany.org/Support/MailingList.

thanks.
good, because i have another question to them...

@johnraff:
basically i just asked if you feel comfortable compiling geany yourself... if yes, then i think it's easy to add more options.
you should look at the geany documentation first (linked earlier by damo), and at my code example.
if there's a "sjis" lexer (whatever that is?), then you could just add another line:

		{ "sjis",	GEANY_FILETYPES_SJIS },

which would make files starting with "#!sjis" recognized as sjis language files.

Offline

#19 2015-09-04 13:39:15

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

Could you please explain

if (strlen(line) > 2 && line[0] == '#' && line[1] == '!')

I don't know any coding language but doesn't the code above look like '#' should be on the first line (line 0) and '!' be on the second line (line 1)?

Offline

#20 2015-09-04 13:50:52

iMBeCil
WAAAT?
From: Edrychwch o'ch cwmpas
Registered: 2012-03-22
Posts: 1,026
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

flaneur wrote:

Could you please explain

if (strlen(line) > 2 && line[0] == '#' && line[1] == '!')

I don't know any coding language but doesn't the code above look like '#' should be on the first line (line 0) and '!' be on the second line (line 1)?

The 'line' here is a variable, not statement, and 'line[0]' represents the very first character of (string) variable 'line'.

Similarly, 'line[1]' represents the second character of (string) variable 'line'.
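
A tiny sketch may make this concrete. The function name starts_with_hashbang is made up for this example; the condition inside it is the one from the quoted code.

```c
#include <string.h>

/* Return 1 if the string `line` begins with "#!" followed by at least
 * one more character, else 0. */
static int starts_with_hashbang(const char *line)
{
	/* line[0] is the first character of the string, line[1] the second */
	return strlen(line) > 2 && line[0] == '#' && line[1] == '!';
}
```

So a string holding "#!/bin/sh" passes the test, while a '#' on one text line followed by '!' on the next does not, because the second character of that string is the newline.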

HTH.


Postpone all your duties; if you die, you won't have to do them ..
--> The very new BL forum! <--

Offline

#21 2015-09-04 14:16:00

flaneur
#! Member
Registered: 2014-01-24
Posts: 99

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

iMBeCil wrote:

...
The 'line' here is a variable, not statement, and 'line[0]' represents the very first character of (string) variable 'line'.

Similarly, 'line[1]' represents the second character of (string) variable 'line'.

HTH.

Thanks! That explains it very well.

Offline

#22 2015-09-04 14:22:52

iMBeCil
WAAAT?
From: Edrychwch o'ch cwmpas
Registered: 2012-03-22
Posts: 1,026
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

^You're welcome.



Offline

#23 2015-09-06 08:44:26

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot wrote:

@johnraff:
basically i just asked if you feel comfortable compiling geany yourself... if yes, then i think it's easy to add more options.

hmm... they say geany's easy to compile, so maybe OK.

if there's a "sjis" lexer (whatever that is?), then you could just add another line:

		{ "sjis",	GEANY_FILETYPES_SJIS },

which would make files starting with "#!sjis" recognized as sjis language files.

There's no SJIS (or Shift_JIS) filetype, it's an encoding which is quite a different issue. I just raised it because Geany can read "encoding:something" headers in the first 512 bytes (I think) of a file so thought maybe it could be hacked to read "filetype:" headers too.



Offline

#24 2015-09-06 11:15:43

ohnonot
...again
Registered: 2012-05-22
Posts: 2,205

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

^ sounds like something that happens in HTML, i mean to look for the encoding at the beginning of the file.
afaiu, it's possible to combine lexers for geany...

what's sjis exactly, and what sort of problem do you have with it?

Offline


#25 2015-09-07 03:06:54

johnraff
nullglob
From: Nagoya, Japan
Registered: 2009-01-07
Posts: 4,148
Website

Re: [Solved] how does geany recognize syntax, not based on filename? mime?

ohnonot, we seem to be getting a bit tangled up here. I don't have any problems with SJIS. It's an encoding, one of the common ones for Japanese, along with EUC-JP and ISO-2022-JP. Of course more and more people are using UTF-8 these days, but when I was hacking at web pages the standard for Japanese was Shift_JIS because it was what was used in Windows. (I don't know whether that's still the case.)

Geany is able to detect the encoding of html pages from the meta-data at the top of the file but for some reason failed to detect SJIS (as Shift_JIS seems to be called in Linux-land). Other apps seemed to have trouble with it too. Anyway, adding the 'encoding:SJIS' header enabled Geany to auto-detect the encoding.

It's nothing to do with lexers or highlighting, it's about rendering characters correctly. I mentioned it as an example of Geany being able to read a custom header and wondered if it might be possible to adapt it to a header like 'filetype:Conky', for example, to get custom filetypes recognized by something other than their filename.
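
To make the idea concrete, here is a purely hypothetical sketch of what scanning for such a header could look like. Nothing in this thread shows that Geany reads a 'filetype:' header; the function find_filetype_header and the exact header syntax are assumptions, and the 512-byte limit is simply borrowed from the encoding rule mentioned earlier.

```c
#include <stddef.h>
#include <string.h>

#define HEADER_SCAN_BYTES 512   /* assumed limit, borrowed from the encoding rule */

/* Look for "filetype:" within the first 512 bytes of `text` and copy the
 * word that follows it into `out`. Return 1 if a name was found, else 0. */
static int find_filetype_header(const char *text, char *out, size_t outsz)
{
	static const char key[] = "filetype:";
	const size_t keylen = sizeof key - 1;
	size_t len = strlen(text);
	size_t i, n;

	if (outsz == 0)
		return 0;
	if (len > HEADER_SCAN_BYTES)
		len = HEADER_SCAN_BYTES;

	for (i = 0; i + keylen <= len; i++) {
		if (strncmp(text + i, key, keylen) != 0)
			continue;
		i += keylen;                     /* first character after the key */
		n = 0;
		while (i + n < len && n + 1 < outsz &&
		       strchr(" \t\r\n", text[i + n]) == NULL)
			n++;                         /* the name ends at whitespace */
		if (n == 0)
			return 0;
		memcpy(out, text + i, n);
		out[n] = '\0';
		return 1;
	}
	return 0;
}
```

A config file starting with a comment line such as "# filetype:Conky" would then yield the name "Conky", which a patched editor could look up among its filetypes.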

Interesting, and something I might play with a bit later...



Offline


Copyright © 2012 CrunchBang Linux.