this post was submitted on 24 Jul 2024

Account in picture was banned within minutes of posting this information

[–] footfaults@hexbear.net 60 points 2 months ago (1 children)

ignore-wordlist-regex

Not a regex

squidward-nochill

squidward-chill

Fake

[–] PaX@hexbear.net 24 points 2 months ago* (last edited 2 months ago) (1 children)

Assuming all those substrings of usernames are split into different expressions like "EndWokeness" (without quotes ofc) it's valid regex, it's just an exact match

Mainposting on main rn

[–] footfaults@hexbear.net 20 points 2 months ago (2 children)

See the thing is, they're not even strings. They're not enclosed in quotes.

[–] PaX@hexbear.net 12 points 2 months ago* (last edited 2 months ago) (1 children)

A program that separates these out of that config file/representation/whatever this is (idk what Okta is tbh) into individual substrings is really easy

I actually wrote a regex that matches each username in that structure for fun lol: (?<=[\[| ])[a-zA-Z0-9_]+ but you can match regex with regex too (It's long as hell though, relatively)

You can try it in your browser at: https://regex101.com/
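For anyone who would rather run it locally than on regex101, here is a minimal Python sketch of that first regex. The sample list is hypothetical (only "EndWokeness" appears in this thread), and it assumes the regex is applied to the bracketed list on its own, not to the whole file.

```python
import re

# Regex quoted from the comment above: a run of word characters that is
# preceded by "[", "|", or a space.
USERNAME_RE = re.compile(r"(?<=[\[| ])[a-zA-Z0-9_]+")

# Hypothetical list value; the names other than EndWokeness are made up.
sample = "[EndWokeness, example_user, user123]"

usernames = USERNAME_RE.findall(sample)
print(usernames)  # ['EndWokeness', 'example_user', 'user123']
```

Because the character class only allows letters, digits, and underscores, this version can pull out plain usernames but not entries that are themselves regexes.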

Idk why they called that field .*regex though, probably cuz it's fake

Wait I can write a better regex lol

Edit: this will match all substrings properly inside that structure, including more regexes, correctly (edit edit: WRONG!): (?<=[\[| ])[^,]+(?=[,|\]])

Edit edit: It's all fucked and my brain hurts now because I wanted to match any valid regex inside of that structure, separately

I will be back with the ultimate regex later, probably recursive and with the caveat that if you want to use comma literals, you will have to escape them, call that shit X-regex (special X.com regex syntax)
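The comma caveat can be seen with the second regex from the edit above. This is a small Python sketch of one way it can go wrong, not necessarily the failure the comment had in mind; the sample entry `user\d{1,3}` is a made-up regex that contains a comma.

```python
import re

# Second regex from the comment above: everything up to the next comma,
# preceded by "[", "|", or a space and followed by ",", "|", or "]".
SECOND_RE = re.compile(r"(?<=[\[| ])[^,]+(?=[,|\]])")

# Hypothetical list whose second entry is itself a regex with a comma in it.
sample = r"[EndWokeness, user\d{1,3}]"

matches = SECOND_RE.findall(sample)
print(matches)  # ['EndWokeness', 'user\\d{1']
```

The entry is cut off at the comma inside `{1,3}`, which is why some way of telling separator commas from literal commas (such as escaping) is needed.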

I am going to bed I am so tired

I'm sorry, there's no way I can write this in a sane-sounding way, it's been a journey and I'm probably drastically overcomplicating this

[–] alexandra_kollontai@hexbear.net 4 points 2 months ago (1 children)

you can match regex with regex too

You can't because the regular languages cannot describe properly nested parentheses.

[–] PaX@hexbear.net 2 points 2 months ago* (last edited 2 months ago)

I should not have said that, I'm sorry I was really tired, but I think it's also more complicated than no

Many implementations of "regular expressions" are actually capable of describing more than regular languages

Like Perl/PCRE's regular expression parser (which I used to write the above regexes) is capable of recursive evaluation and backreferences and probably other stuff I don't know about cuz I don't use it very often
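As an illustration of the nested-parentheses point: checking that parentheses are balanced needs a counter that can grow without bound, which is exactly what a finite automaton does not have. A minimal Python sketch follows; the function name is made up for illustration, and it ignores escaped parentheses and character classes, so it is not a regex validator.

```python
def is_balanced(text: str) -> bool:
    """Return True if every "(" in text has a matching ")" in order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ")" appeared before its matching "(".
                return False
    # Balanced only if every "(" was eventually closed.
    return depth == 0


print(is_balanced("(a(b)c)"))
print(is_balanced("(()"))
```

Engines with recursion, such as PCRE with `(?R)`, can express this kind of counting inside a pattern, which is part of why they go beyond regular languages.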

I don't actually know if you can or not but yeah

Tbh, you probably know more about formal language theory than me blob-no-thoughts

[–] PaX@hexbear.net 2 points 2 months ago* (last edited 2 months ago) (2 children)

Hi, idk if you saw my other reply, but I have returned with the promised regular expression capable of doing this kitty-cri-potato (at least when parsed by PCRE and similar)

(?<=(?<!\\)\[|(?<!\\), ).*?(?=(?<!\\)]|(?<!\\),)

It will match each username/regex matching usernames as an individual substring

In the end I didn't need recursion and managed to accomplish the task by using nested lookarounds and making the assumption that brackets and commas are escaped with backslashes. It could probably be further simplified by using subroutines that some regex parsers are capable of using. Also it is most likely possible to write a regex that doesn't require escaping brackets, besides when you need to escape brackets when writing regexes anyway ofc. The requirement that commas be escaped is analogous to requiring that quotes be escaped inside a string literal if instead the usernames/regexes that match usernames were enclosed in quotes
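A rough Python sketch of the same idea, for anyone who wants to poke at it. Python's `re` module only accepts fixed-width lookbehinds, so the single PCRE lookbehind (one alternative of width 1, one of width 2) is split into two lookbehinds here; that split is an adaptation, not the regex as posted. The sample list is hypothetical and follows the stated assumption that literal commas and brackets are backslash-escaped.

```python
import re

# Adapted from the PCRE pattern quoted above. The original lookbehind mixes
# a width-1 and a width-2 alternative, which Python's `re` rejects, so each
# alternative gets its own lookbehind.
ENTRY_RE = re.compile(
    r"(?:(?<=(?<!\\)\[)|(?<=(?<!\\), ))"  # after an unescaped "[" or ", "
    r".*?"                                 # shortest possible entry
    r"(?=(?<!\\)]|(?<!\\),)"               # before an unescaped "]" or ","
)

# Hypothetical list; the comma inside {1,3} is escaped as the comment requires.
sample = r"[EndWokeness, user\d{1\,3}, example_user]"

entries = ENTRY_RE.findall(sample)
for entry in entries:
    print(entry)
```

The escaped comma stays inside its entry, while the unescaped commas still act as separators.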

I uhhhh don't write many regexes blob-no-thoughts

There's probably something to be said about this task being easier to accomplish in languages "more powerful" (in chomsky-yes-honey terms: "Chomsky hierarchy") than regular languages but I'm not chomsky-yes-honey (kamala-coconut-tree-free languages, etc contextphobic )

I have a tendency toward owning myself, if you find a way to break my regex without breaking the assumptions specified above I will be further owned

[–] footfaults@hexbear.net 3 points 2 months ago (1 children)

Here's the thing. The snippet that was posted by that account is not any known format. It's not YAML, it's not INI syntax, it's not JSON, or TOML, or anything that is a common configuration syntax. It's not valid JS. It's bullshit. It's just close enough to programming code that it would maybe convince some people.

So, while you spent a lot of time proving that if you were forced to work with this file, there's an incredibly nasty set of regex and parsing that you could do to make this actually work, there's absolutely no reason why this would be done.

[–] PaX@hexbear.net 1 points 2 months ago

People rolling their own formats isn't really that uncommon. And besides, that file is one s/ =/:/g away from being valid YAML. There might even be a YAML or TOML parser around that will accept this, idk
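A quick Python sketch of that substitution, on a made-up line in the same shape (the field name is taken from the thread, the usernames other than EndWokeness are hypothetical, and the real file's contents are not reproduced here):

```python
import re

# Hypothetical line in the style of the posted snippet.
sample = "ignore-wordlist-regex = [EndWokeness, example_user]"

# Python equivalent of the sed expression s/ =/:/g
as_yaml = re.sub(r" =", ":", sample)
print(as_yaml)  # ignore-wordlist-regex: [EndWokeness, example_user]
```

The result has the shape of a YAML mapping with a flow-style sequence as its value; whether the whole file would parse depends on lines not shown in this thread.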

there's an incredibly nasty set of regex and parsing that you could do to make this actually work

It only really looks nasty cuz I wanted to "parse" this file mentioning regex with regex for fun lol, it's basically YAML

The syntax of that file isn't the sus part imo. I feel like I'm being an annoying pedant but yeah

[–] footfaults@hexbear.net 3 points 2 months ago* (last edited 2 months ago) (1 children)

I saw your comments. I was trying to get home and use my FreeBSD machine to reply since I was on mobile today and of course amdgpu decided to start causing kernel panics so it ate all my time. I'll respond probably tomorrow evening.

You've done enough work where it is worth having a full discussion instead of trying to type something up on my mobile device. It's all good stuff, you put the work in

[–] PaX@hexbear.net 2 points 2 months ago* (last edited 2 months ago) (1 children)

Ohh, you're good, I hope I didn't make you feel pressured to reply or anything. Sometimes I just start writing and it ends up being a lot lol, besides I just like writing regular expressions sometimes lol

Also o7 to another BSD user. I'm speaking to you with an OpenBSD machine rn lol. I hope you got your kernel panicking fixed

Post stack trace here if you want help perhaps, I've also had to debug BSD kernels before (although the graphics stuff is mostly ripped straight from Linux lol)