C# Regex appending every 2nd lot?

master maste

New Member
Reaction score
32
I'm currently using regex on some of my webpages, and I'm adding the text to a .txt file on my computer.

Code:
Regex regexObj = new Regex("(?<=<div class=\"item-artist\" style=\"display:none;\">)(.*?)(?=</div>)|((?<=<div class=\"item-title\" style=\"display:none;\">)(.*?)(?=</div>))");

I'm currently appending
Code:
textBox.AppendText(matchResults.Value + " - \r\n");

Everything up to here works perfectly; I'm wondering how to only add the appended text to every 2nd lot.

I want to display the artist with a dash and then the title.

Code:
Artist - Title
Artist - Title
Artist - Title
Artist - Title

^^Like that. At the moment it does this:

Code:
Artist - Title - Artist - Title - Artist - Title - Artist - Title


Thanks for any help you may be able to give, I've been googling all night.
 

master maste

New Member
Reaction score
32
Looks like this:

Code:
<div class="item-title" style="display: none;">TitleIWant</div>

<div class="item-artist" style="display: none;">ArtistIWant</div>
 

Icyculyr

I'm a Mac
Reaction score
68
Can you post your full code so I can have a look at it better?
textBox.AppendText(matchResults.Value + " - \r\n");
I assume this is in some kind of loop. Why can't you just use an int to count every second value and append the text differently? Or have I misunderstood?

I don't remember the C# syntax that well (I use VB), however I'd do something like this:
Code:
int i = 0;
// (in your loop)
if (i % 2 == 0)
{
    //Display result with -
}
else
{
    //Display result without -
}
i += 1;

IIRC % is the modulo operator. I'd have to see more of your code to give a more detailed reply.
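
A minimal C# sketch of the counter idea above. The class name `ArtistTitlePairs`, the method `Format`, and the simplified pattern are assumptions for illustration, not the original poster's code. It pairs matches in document order, so it assumes artist and title always alternate in the page.

```csharp
using System.Text;
using System.Text.RegularExpressions;

public static class ArtistTitlePairs
{
    // Joins matches two at a time: the first of each pair is followed
    // by " - ", the second by a line break.
    public static string Format(string html)
    {
        // Hypothetical simplified pattern: captures the text of either div.
        // \s* tolerates "display:none;" as well as "display: none;".
        var regex = new Regex(
            "<div class=\"item-(?:artist|title)\" style=\"display:\\s*none;\">(.*?)</div>");

        var sb = new StringBuilder();
        int i = 0;
        foreach (Match m in regex.Matches(html))
        {
            sb.Append(m.Groups[1].Value);
            // Even index: separator. Odd index: end of the line.
            sb.Append(i % 2 == 0 ? " - " : "\r\n");
            i++;
        }
        return sb.ToString();
    }
}
```

The result could then be passed to `textBox.AppendText(...)` in one call instead of appending inside the loop.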
 

master maste

New Member
Reaction score
32
Thanks for the help guys, this has been great.

I'm having some more regex issues, and I might as well put them here instead of creating another thread for the same thing.

I'm trying to pull the text from between <TITLE></TITLE> tags. I'm currently using:

Code:
Regex regexObj = new Regex("(?<=<TITLE>)(.*)(?=</TITLE>)");

I've tried various combinations, searched Google, read tutorials etc. but it's not finding anything. I can however find <TITLE> and </TITLE> separately...

I'm currently working on the page through my localhost, and looking at the site's HTML code it should show this:

Code:
<TITLE>
      
        
        
        
        
          Excellent Monitor: Viewsonic VX2433WM
        
      
      
    </TITLE>
That's the exact formatting with all the spaces etc. (maybe the weird gapping is causing my issue?)

anyways +rep for the last bit of help :)
 

Vestras

Retired
Reaction score
248
Ehh, I don't know what regex system you're using, but to me that regex looks kind of wrong, and it could basically go wrong if the input was <title></title>. I would find some way to match <title> regardless of capitalization.

This is how I'd do it:
\<title\>\s*(?<input>.+)\s*\</title\>

Then, if you're using C#, do this:
Code:
Match match = Regex.Match(inputString, pattern);
// To access the input group:
string input = match.Groups["input"].Value;

Next time, just test your regexes with a program called RegexBuddy. It's very good.
 

master maste

New Member
Reaction score
32
Maybe it looks wrong because it's not working?

Also, I'm trying to exclude the <TITLE></TITLE> tags and just get what's in between them. I understand that's what your "input" group was for, but I'm using lookarounds so that's not really necessary.

Hmm, messing around with your one, but C# does not recognize the escape sequence \s...

edit: oh yeah thanks for RegexBuddy, just downloaded it and going to mess around with it tonight and see if I can get something.

edit2: OK, here's what I've got so far (note this works perfectly in RegexBuddy, but not properly in C# even though the same HTML code is being used):
Code:
Regex regexObj = new Regex("<(title)>(.*?)<(/title)>");
Match matchResults = regexObj.Match(WebScreen);
while (matchResults.Success)
{
    display.AppendText(matchResults.Value + "\r\n");
    matchResults = matchResults.NextMatch();
}

C# will only pick up the title when the regex code is as follows:
Code:
Regex regexObj = new Regex("<(title)>");
For some reason it won't let me add the closing title tag or anything in between, but I can do all 3 separately...
 

Vestras

Retired
Reaction score
248
Your code is wrong. It should be this:
Code:
Regex regexObj = new Regex(@"<(title)>(.*?)<(/title)>");
Match matchResults = regexObj.Match(WebScreen);
while (matchResults.Success)
{
    display.AppendText(matchResults.Value + "\r\n");
    matchResults = matchResults.NextMatch();
}
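
On the \s error mentioned earlier: in a regular C# string literal the compiler treats a backslash as the start of a string escape, and \s is not one, so the backslash has to be doubled or the pattern written as a verbatim @"..." string. A small sketch of the two spellings follows; the class name `VerbatimPatternDemo` and the method `ExtractTitle` are made up for illustration, and the lazy `.+?` is a variation on the pattern posted above so that trailing whitespace is left out of the group.

```csharp
using System.Text.RegularExpressions;

public static class VerbatimPatternDemo
{
    // The same regex written two ways. Both strings contain identical characters.
    public const string Escaped = "<title>\\s*(?<input>.+?)\\s*</title>";
    public const string Verbatim = @"<title>\s*(?<input>.+?)\s*</title>";

    // Returns the text of the named group "input", or an empty string if
    // there is no match. No RegexOptions are set here, so the match is
    // case-sensitive and "." does not cross line breaks.
    public static string ExtractTitle(string html)
    {
        Match match = Regex.Match(html, Verbatim);
        return match.Success ? match.Groups["input"].Value : string.Empty;
    }
}
```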
 

Icyculyr

I'm a Mac
Reaction score
68
I found this on another forum, it's in VB because I don't know how to do Singleline + IgnoreCase in C#:
Code:
Dim rRegex As New Regex("(?<=<title>).+?(?=</title>)", RegexOptions.Singleline Or RegexOptions.IgnoreCase)
I had to manually remove the NewLine character from it to get rid of the weird symbols, not sure what the equiv. is in C# but I used this in VB:
Code:
Dim sValue As String = mMatch.Value.Replace(ControlChars.NewLine, "")
I'm not sure if you have ControlChars, maybe try Environment.NewLine if you have that or whatever represents NewLine in C#.
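
For reference, a rough C# counterpart of the two VB lines above. C# combines enum flags with `|` where VB uses `Or`. The class name `TitleExtractor` and the method `GetTitle` are hypothetical, and stripping both "\r\n" and bare "\n" is an assumption about what line endings the page uses.

```csharp
using System.Text.RegularExpressions;

public static class TitleExtractor
{
    public static string GetTitle(string html)
    {
        // Singleline lets "." match line breaks; IgnoreCase matches
        // <TITLE> as well as <title>.
        var regex = new Regex("(?<=<title>).+?(?=</title>)",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        Match match = regex.Match(html);
        if (!match.Success)
        {
            return string.Empty;
        }

        // Remove line breaks: Windows-style first, then any bare "\n".
        return match.Value.Replace("\r\n", "").Replace("\n", "");
    }
}
```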
 

master maste

New Member
Reaction score
32
Thanks, that worked a charm. Turns out I needed to add RegexOptions.Singleline.

Code:
Regex regexObj = new Regex(@"<(TITLE)>(.*?)<(/TITLE)>", RegexOptions.Singleline);

Thank you guys for your help.

edit: although it works I'm currently trying to figure out how to get rid of the extra spaces before and after the title tags. Don't know why it was formatted like that, not sure how to do it without getting rid of the spaces in between the title words themselves.

edit2: just realized that I don't really need to know how to get rid of the extra spaces, but knowing wouldn't hurt if anyone knew..
 

Icyculyr

I'm a Mac
Reaction score
68
I'd loop from the start of the line until you find the first letter to remove the spaces at the front. The end may be a bit more tricky, but try to find the last "letter" by checking, say, 10-20 characters ahead: if they are all spaces then you can probably remove them (it's unlikely there would be 20 spaces between any word or letter).
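
A shorter route than a manual loop, sketched under the assumption that the captured title is already in a string: `String.Trim()` removes leading and trailing whitespace (including line breaks) and leaves the spaces between words alone. The class name `TitleCleaner` and the method `Clean` are hypothetical, and the second step is optional.

```csharp
using System.Text.RegularExpressions;

public static class TitleCleaner
{
    public static string Clean(string rawTitle)
    {
        // Trim() strips whitespace from both ends only.
        string trimmed = rawTitle.Trim();

        // Optional: collapse any run of whitespace inside the title
        // into a single space.
        return Regex.Replace(trimmed, @"\s+", " ");
    }
}
```

For example, `display.AppendText(TitleCleaner.Clean(matchResults.Groups[2].Value) + "\r\n");` would use the second capture group of the `<(TITLE)>(.*?)<(/TITLE)>` pattern posted above.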
 

master maste

New Member
Reaction score
32
just thinking I could potentially not append anything that has 2 spaces next to each other right? since words won't have more than 1 space in between...
 

Icyculyr

I'm a Mac
Reaction score
68
just thinking I could potentially not append anything that has 2 spaces next to each other right? since words won't have more than 1 space in between...
Technically yes.
 

master maste

New Member
Reaction score
32
Argghh, just did some testing and figured out that C# is using the website's source code and not what's actually on the website. I believe most of it is done via JS.

Makes sense why most of the other tags I tried pulled nothing whatsoever, since I was going by what was actually in the page elements...

Any way to make it use what's on the page, instead of the source code before the JavaScript has run?

Must be a way somehow... Thanks for all the help so far anyways :).
 

Icyculyr

I'm a Mac
Reaction score
68
Erm, could you turn JavaScript off in your browser? How are you getting the page contents of the website?
 

master maste

New Member
Reaction score
32
Using Firefox, right-clicking on the page and choosing "View Page Source". But clearly C# is getting the actual source.

edit: I could turn JavaScript off, but there are error messages displayed on the page when you don't have JavaScript enabled...
 