-9 points

How to Use StringPool to Reduce String Allocations in C# - Code Maze (code-maze.com)

submitted 1 year ago by SmartmanApps to c/csharp


porgamrer 3 points 1 year ago (last edited 1 year ago)

For 99% of use cases this string pool is just slower. Whether intentionally or not, the benchmark code is strange and misleading.

String and StringPool are only slower in the final benchmark because doing 100,000 allocations in a synchronous loop while retaining a reference to each one is the worst-case scenario for a generational GC. It forcibly and artificially breaks the generational hypothesis, which assumes that most objects die young.

Conversely, caching 100,000 samples of the same 16 strings (!!!) is the best possible case for the string pool. It spends zero time in GC because the benchmark code contains this very unrealistic pattern.

Most real code is going to quickly forget intermediate strings and clean them up very cheaply in the nursery generation (gen 0 in .NET). If you do need to sample 100,000 substrings in a synchronous loop, you can just use ReadOnlySpan<char>.
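As a rough sketch of the span approach, assuming the input is made up of fixed-width fields (the SpanSampling class, the CountMatches method, and the field layout are hypothetical, not taken from the article):

```csharp
using System;

public static class SpanSampling
{
    // Counts how many fixed-width fields in `text` equal `target`,
    // slicing with ReadOnlySpan<char> instead of calling Substring.
    // The fixed-width layout is an assumption made for illustration.
    public static int CountMatches(string text, int width, string target)
    {
        ReadOnlySpan<char> source = text.AsSpan();
        ReadOnlySpan<char> wanted = target.AsSpan();
        int matches = 0;

        for (int start = 0; start + width <= source.Length; start += width)
        {
            // Slice is a view over the original characters; no new string is created.
            ReadOnlySpan<char> field = source.Slice(start, width);

            if (field.SequenceEqual(wanted))
            {
                matches++;
            }
        }

        return matches;
    }
}
```

Nothing in the loop creates a string, so there is nothing for the GC to collect or for a pool to cache.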

There are real use-cases for string caches and tries, but they are pretty rare.

[email protected] 2 points 1 year ago (last edited 1 year ago)

I think the focus of the article is on highlighting allocation performance (which is the goal of the StringPool) rather than overall performance (i.e. speed), so the benchmark, while artificial, is designed to focus on that specific thing. This is actually pointed out in the article just before the benchmark results:

"It is important to note that since the focus of StringPool is reducing memory allocation, our main focus in the benchmark is on allocations more than on speed:"

I agree that an additional benchmark showing it in a more real-world scenario could prove helpful, but the existing benchmark does a good job of highlighting the allocation reduction seen when processing large amounts of char data. A more real-world example would be something like a file upload validation method that first checks the file extension against a HashSet<string> of valid extensions. In that scenario we would be able to take the filename as a Span and extract the extension from it as a Span, but we cannot call HashSet.Contains() with a Span; we have to use a string. That would require calling extensionSpan.ToString(). Here we could use the StringPool to avoid the unnecessary string allocation (while the article does not use this particular example, it does mention other related scenarios).
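A minimal sketch of that validation scenario, assuming the StringPool from the CommunityToolkit.HighPerformance NuGet package (the UploadValidator class, the HasAllowedExtension method, and the allow-list entries are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using CommunityToolkit.HighPerformance.Buffers; // NuGet: CommunityToolkit.HighPerformance

public static class UploadValidator
{
    // Hypothetical allow-list; the entries are illustrative only.
    private static readonly HashSet<string> AllowedExtensions =
        new(StringComparer.OrdinalIgnoreCase) { ".png", ".jpg", ".pdf" };

    public static bool HasAllowedExtension(ReadOnlySpan<char> fileName)
    {
        // Path.GetExtension has a span overload, so this step creates no string.
        ReadOnlySpan<char> extension = Path.GetExtension(fileName);

        if (extension.IsEmpty)
        {
            return false;
        }

        // GetOrAdd returns a pooled string for these characters and only
        // allocates a new one when the pool has no matching entry.
        string pooled = StringPool.Shared.GetOrAdd(extension);

        return AllowedExtensions.Contains(pooled);
    }
}
```

After the first upload with a given extension, repeated checks for that extension should normally reuse the pooled string instead of allocating a new one with ToString(). The pool has a fixed capacity, so an entry can be replaced over time and allocated again later.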

Overall, as you mention, the real use-cases for string caches (such as StringPool) are pretty rare. It is a niche topic, but for those who need to do something like that, I think the article presents an accessible introduction.

SmartmanApps 1 point 1 year ago

Oh ok. Thanks for that extra info. I was wondering why this (apparent) performance tip was getting downvoted, but maybe that's it.