You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
450 lines
18 KiB
450 lines
18 KiB
3 years ago
|
# Text support
|
||
|
|
||
|
Acedia provides its own `Text` / `MutableText` classes for working with text
|
||
|
that are supposed to replace `string` variables as much as possible.
|
||
|
|
||
|
Main reasons to forgo `string` in favor of custom text types are:
|
||
|
|
||
|
1. `string` does not allow cheap access to either individual characters or
|
||
|
codepoints, which makes computing `string`'s hash too expensive;
|
||
|
2. Expanding `string`'s functionality without introducing new types would
|
||
|
require (for many cases) to disassemble it into codepoints and then to
|
||
|
assemble it back for each transformation;
|
||
|
3. Established way of defining characters' color for `string`s is inconvenient
|
||
|
to work with.
|
||
|
|
||
|
These issues can be resolved with our new text types: `Text` and `MutableText`,
|
||
|
whose only difference is their mutability.
|
||
|
|
||
|
> **NOTE:**
|
||
|
> `Text` and `MutableText` aren't yet in their finished state:
|
||
|
> using them is rather clunky compared to native `string`s and both their
|
||
|
> interface and implementation can be improved. While they already provide some
|
||
|
> important benefits, Acedia's insistence on replacing `string` with `Text` is
|
||
|
> more motivated by its supposed future, rather than current, state.
|
||
|
|
||
|
## `string`
|
||
|
|
||
|
Even if `Text`/`MutableText` are supposed to replace `string` variables, they
|
||
|
still have to be used to either produce `Text`/`MutableText` instances or
|
||
|
to store their values in config files.
|
||
|
This means we have to cover how Acedia deals with `string`s.
|
||
|
|
||
|
### Colored vs plain strings
|
||
|
|
||
|
**Colored strings** are normal UnrealScript `string`s that can contain
|
||
|
4-byte color changing sequences. Whenever some Acedia function takes
|
||
|
a *colored string* these color changing sequences are converted into formatting
|
||
|
information about color of its characters and are not treated
|
||
|
as separate symbols.
|
||
|
|
||
|
> If you are unaware, 4-byte color changing sequences are defined as
|
||
|
> `<0x1b><red_byte><green_byte><blue_byte>` and they allow to color text that is
|
||
|
> being displayed by several native UnrealScript functions.
|
||
|
> For example, `string` that is defined as
|
||
|
> `"One word is colored" @ Chr(0x1b) $ Chr(1) $ Chr(255) $ Chr(1) $ "green"`
|
||
|
> will be output in game's console with its last word colored green.
|
||
|
> Red and blue bytes are taken as `1` instead of `0` because putting zero
|
||
|
> inside break the `string`. `10` is another value that leads to unexpected
|
||
|
> results and should be avoided.
|
||
|
|
||
|
**Plain strings** are `string`s for which all contents are treated as their own
|
||
|
symbols.
|
||
|
If you pass a `string` with 4-byte color changing sequence to some method as
|
||
|
a *plain string*, these 4 bytes will also be treated as characters and
|
||
|
no color information will be extracted as a result.
|
||
|
|
||
|
Plain strings are generally handled faster than colored strings.
|
||
|
|
||
|
### Formatted strings
|
||
|
|
||
|
Formatted `string`s are Acedia's addition and allow to define color information
|
||
|
in a more human-readable way than *colored strings*.
|
||
|
|
||
|
To mark some part of a `string` to have a particular color you need to enclose
|
||
|
it into curly braces `{}`, specify color right after the opening brace
|
||
|
(without any spacing), then, after a single whitespace, must follow
|
||
|
the colored content.
|
||
|
For example, `"Each of these will be colored appropriately: {#ff0000 red}, {#00ff00 green}, {#0000ff blue}!"`
|
||
|
will correspond to a line
|
||
|
`Each of these will be colored appropriately: red, green, blue!`
|
||
|
and only three words representing colors will have any color defined for them.
|
||
|
|
||
|
Color can be specified not only in hex format, but in also in one of
|
||
|
the more readable ways: `rgb(255,0,0)`, `rgb(r=0,G=255,b=255)`,
|
||
|
`rgba(r=45,g=167,b=32,a=200)`.
|
||
|
Or even using color aliases:
|
||
|
`"Each of these will be colored appropriately: {$red red}, {$green green}, {$blue blue}!"`.
|
||
|
|
||
|
These formatting blocks can also be folded into each other:
|
||
|
`"Here {$purple is mostly purple, but {$red some parts} are {$yellow different} color}."`
|
||
|
with an arbitrary depth.
|
||
|
|
||
|
### Conversion
|
||
|
|
||
|
Various types of `string`s can be converted between each other by using
|
||
|
`Text` class, but do note that *formatted strings* can contain more information
|
||
|
than *colored strings* (since latter cannot simply close the colored segment)
|
||
|
and both of them can contain more information than *plain strings*, so
|
||
|
such conversion can lead to information loss.
|
||
|
Examples of conversion:
|
||
|
|
||
|
```unrealscript
|
||
|
local Text auxiliary;
|
||
|
auxiliary = _.text.FromFormattedString("{$gold Hello}, {$crimson world}!");
|
||
|
// Produces a string colored with 4-byte codes, a native way for UnrealScript
|
||
|
auxiliary.ToColoredString();
|
||
|
// Strings all color and produces "Hello, world!"
|
||
|
auxiliary.ToPlainString();
|
||
|
// Don't forget the cleanup!
|
||
|
_.memory.Free(auxiliary);
|
||
|
```
|
||
|
|
||
|
## `Character`
|
||
|
|
||
|
`Character` describes a single symbol of a string and is a smallest text element
|
||
|
that can be returned from a `string` by Acedia's methods.
|
||
|
It contains data about what symbol it represents and what color it has.
|
||
|
`Character` can also be considered invalid, which means that it does not
|
||
|
represent any valid symbol. Validity can be checked with
|
||
|
`_.text.IsValidCharacter()` method.
|
||
|
|
||
|
`Character` is defined as a structure with public fields
|
||
|
(necessary for the implementation), but you should not access them directly
|
||
|
if you wish for your code to stay compatible with future versions of Acedia.
|
||
|
|
||
|
### `Formatting`
|
||
|
|
||
|
Formatting describes how character should be displayed, which currently
|
||
|
corresponds to simply it's color (or the lack of it).
|
||
|
Formatting of a character can be accessed through
|
||
|
`_.text.GetCharacterFormatting()` method and changed
|
||
|
with `_.text.SetFormatting()`.
|
||
|
|
||
|
It is a structure that contains two public fields, which can be freely accessed
|
||
|
(unlike `Character`'s fields):
|
||
|
|
||
|
1. `isColored`: defines whether `Character` is even colored.
|
||
|
2. `color`: color of the `Character`. Only used if `isColored == true`.
|
||
|
|
||
|
## `Text` and `MutableText`
|
||
|
|
||
|
`Text` is an `AcediaObject` that must be appropriately allocated
|
||
|
(also deallocated) and is used by Acedia as substitute for a `string`.
|
||
|
It's contents are immutable: you can expect that they will not change if you
|
||
|
pass a `Text` as an argument to some method, although the whole object can
|
||
|
be deallocated.
|
||
|
`MutableText` is a child class of a `Text` that can change its own contents.
|
||
|
|
||
|
To create either of them you can use `TextAPI` methods:
|
||
|
`_.text.Empty()` to create empty mutable text,
|
||
|
`_.text.FromString()` / `_.text.FromStringM()` to create immutable/mutable
|
||
|
text variants from a plain `string` and their analogues
|
||
|
`_.text.FromColoredString()` / `_.text.FromColoredStringM()` /
|
||
|
`_.text.FromFormattedString()` / `_.text.FromFormattedStringM()`
|
||
|
for colored and formatted `string`s.
|
||
|
|
||
|
You can also get a `string` back by calling either of
|
||
|
`self.ToPlainString()` / `self.ToColoredString()` / `self.ToFormattedString()`
|
||
|
methods.
|
||
|
|
||
|
To duplicate `Text` / `MutableText` themselves you can use `Copy()`
|
||
|
for immutable copies and `MutableCopy()` for mutable ones.
|
||
|
|
||
|
## Defining `Text` / `MutableText` constants
|
||
|
|
||
|
The major drawback of `Text` is how inconvenient it is to use it, compared to
|
||
|
simple string literals. It needs to be defined, allocated, used and
|
||
|
then deallocated:
|
||
|
|
||
|
```unrealscript
|
||
|
local Text message;
|
||
|
message = _.text.FromString("Just some message to y'all!");
|
||
|
_.console.ForAll().WriteLine(message)
|
||
|
.FreeSelf(); // Freeing console writer
|
||
|
message.FreeSelf(); // Freeing message
|
||
|
```
|
||
|
|
||
|
which can lead to some boilerplate code. Unfortunately, at this moment not much
|
||
|
can be done about this boilerplate. An ideal way to work with text literals
|
||
|
right now is to create `Text` instances with all the necessary text constants on
|
||
|
initialization and then use them:
|
||
|
|
||
|
```unrealscript
|
||
|
class SomeClass extends AcediaObject;
|
||
|
|
||
|
var Text MESSAGE, SPECIAL;
|
||
|
|
||
|
protected function StaticConstructor()
|
||
|
{
|
||
|
default.MESSAGE = _.text.FromString("Just some message to y'all!");
|
||
|
default.SPECIAL = _.text.FromString("Only for special occasions!");
|
||
|
}
|
||
|
|
||
|
public final function DoSend()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(MESSAGE).FreeSelf();
|
||
|
}
|
||
|
|
||
|
public final function DoSendSpecial()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(SPECIAL).FreeSelf();
|
||
|
}
|
||
|
```
|
||
|
|
||
|
Acedia also pre-defines `stringConstants` array that will be automatically
|
||
|
converted into an array of `Text`s that can later be accessed by their indices
|
||
|
through the `T()` method:
|
||
|
|
||
|
```unrealscript
|
||
|
class SomeClass extends AcediaObject;
|
||
|
|
||
|
var int TMESSAGE, TSPECIAL;
|
||
|
|
||
|
public final function DoSend()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(T(TMESSAGE)).FreeSelf();
|
||
|
}
|
||
|
|
||
|
public final function DoSendSpecial()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(T(TSPECIAL)).FreeSelf();
|
||
|
}
|
||
|
|
||
|
defaultproperties
|
||
|
{
|
||
|
TMESSAGE = 0
|
||
|
stringConstants(0) = "Just some message to y'all!"
|
||
|
TSPECIAL = 1
|
||
|
stringConstants(1) = "Only for special occasions!"
|
||
|
}
|
||
|
```
|
||
|
|
||
|
This way of doing things is a bit more cumbersome, but is also safer in
|
||
|
the sense that `T()` will automatically allocate a new `Text` instance should
|
||
|
someone deallocate previous one:
|
||
|
|
||
|
```unrealscript
|
||
|
local Text oldOne, newOne;
|
||
|
oldOne = T(TMESSAGE);
|
||
|
// `T()` returns the same instance of `Text`
|
||
|
TEST_ExpectTrue(oldOne == T(TMESSAGE))
|
||
|
// Until we deallocate it...
|
||
|
oldOne.FreeSelf();
|
||
|
// ...then it creates and returns newly allocated `Text` instance
|
||
|
newOne = T(TMESSAGE);
|
||
|
TEST_ExpectTrue(newOne.IsAllocated());
|
||
|
|
||
|
// This assertion *might* not actually be correct, since `newOne` can be
|
||
|
// just an `oldOne`, reallocated from the object pool.
|
||
|
// TEST_ExpectFalse(oldOne == newOne);
|
||
|
```
|
||
|
|
||
|
### An easier way
|
||
|
|
||
|
While you should ideally define `Text` constants, setting them up can
|
||
|
get annoying.
|
||
|
To alleviate this issue Acedia provides three more methods for quickly
|
||
|
converting `string`s into `Text`: `P()` for plain `string`s,
|
||
|
`C()` for colored `string`s and `F()` for formatted `string`s.
|
||
|
With them out `SomeClass` can be rewritten as:
|
||
|
|
||
|
```unrealscript
|
||
|
class SomeClass extends AcediaObject;
|
||
|
|
||
|
public final function DoSend()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(P("Just some message to y'all!")).FreeSelf();
|
||
|
}
|
||
|
|
||
|
public final function DoSendSpecial()
|
||
|
{
|
||
|
_.console.ForAll().WriteLine(P("Only for special occasions!")).FreeSelf();
|
||
|
}
|
||
|
```
|
||
|
|
||
|
They do not endlessly create `Text` instances, since they cache and reuse
|
||
|
the ones they return for the same `string`:
|
||
|
|
||
|
```unrealscript
|
||
|
local Text firstInstance;
|
||
|
firstInstance = F("{$purple Some} {$red colored} {$yellow text}.");
|
||
|
// `F()` returns the same instance for the same `string`
|
||
|
TEST_ExpectTrue( firstInstance
|
||
|
== F("{$purple Some} {$red colored} {$yellow text}."));
|
||
|
// But not for different one
|
||
|
TEST_ExpectFalse(firstInstance == F("Some other string"));
|
||
|
// Still the same
|
||
|
TEST_ExpectTrue( firstInstance
|
||
|
== F("{$purple Some} {$red colored} {$yellow text}."));
|
||
|
```
|
||
|
|
||
|
Ideally one would at some point replace these calls with pre-defined constants,
|
||
|
but if you're using only a small amount of literals in your class,
|
||
|
then relying on them should be fine. However avoid using them for
|
||
|
an arbitrarily large amounts of `string`s, since as cache's size grows,
|
||
|
these methods will become increasingly less efficient:
|
||
|
|
||
|
```unrealscript
|
||
|
// The more you call this method with different arguments, the worse
|
||
|
// performance gets since `C()` has to look `string`s up in
|
||
|
// larger and larger cache.
|
||
|
public function DisplayIt(string message)
|
||
|
{
|
||
|
// This is bad, don't do this
|
||
|
_.console.ForAll().WriteLine(C(message)).FreeSelf();
|
||
|
}
|
||
|
```
|
||
|
|
||
|
## Parsing
|
||
|
|
||
|
Acedia provides some parsing functionality through a `Parser` class:
|
||
|
it must first be initialized by either `Initialize()` or `InitializeS()` method
|
||
|
(the only difference whether they take `Text` or `string` as a parameter)
|
||
|
and then it can parse passed contents by consuming its symbols from
|
||
|
the beginning to the end.
|
||
|
|
||
|
For that it provides a set of *matcher methods* that try to read certain values
|
||
|
from the input.
|
||
|
For example, following can parse a color, defined in a hex format:
|
||
|
|
||
|
```unrealscript
|
||
|
local Parser parser;
|
||
|
local int redComponent, greenComponent, blueComponent;
|
||
|
parser = _.text.ParseString("#23a405");
|
||
|
parser.MatchS("#").MUnsignedInteger(redComponent, 16, 2)
|
||
|
.MUnsignedInteger(greenComponent, 16, 2)
|
||
|
.MUnsignedInteger(blueComponent, 16, 2);
|
||
|
// These should be correct values
|
||
|
TEST_ExpectTrue(redComponent == 35);
|
||
|
TEST_ExpectTrue(greenComponent == 164);
|
||
|
TEST_ExpectTrue(blueComponent == 5);
|
||
|
```
|
||
|
|
||
|
Here `MatchS()` matches an exact `string` constant and `MUnsignedInteger()`
|
||
|
matches an unsigned number (with base `16`) of length `2`, recording parsed
|
||
|
value into its first argument.
|
||
|
|
||
|
Another example of parsing a color in format `rgb(123, 135, 2)`:
|
||
|
|
||
|
```unrealscript
|
||
|
local Parser parser;
|
||
|
local int redComponent, greenComponent, blueComponent;
|
||
|
parser = _.text.ParseString("RGB( 123,135 , 2)");
|
||
|
parser.MatchS("rgb(", SCASE_INSENSITIVE).Skip()
|
||
|
.MInteger(redComponent).Skip().MatchS(",").Skip()
|
||
|
.MInteger(greenComponent).Skip().MatchS(",").Skip()
|
||
|
.MInteger(blueComponent).Skip().MatchS(")");
|
||
|
// These should be correct values
|
||
|
TEST_ExpectTrue(redComponent == 123);
|
||
|
TEST_ExpectTrue(greenComponent == 135);
|
||
|
TEST_ExpectTrue(blueComponent == 2);
|
||
|
TEST_ExpectTrue(parser.Ok());
|
||
|
```
|
||
|
|
||
|
where `MInteger()` matches any decimal integer and then records that integer
|
||
|
into the first argument. `Skip()` matches a sequence of whitespaces of
|
||
|
an arbitrary length, adding some these calls allows this code to parse colors
|
||
|
defined with spacings between numbers and other characters like
|
||
|
`rgb( 12, 13 , 107 )`. `Ok()` method simply confirms that all matching calls
|
||
|
so far have succeeded.
|
||
|
|
||
|
If you are unsure in which format the color was defined, then you can use
|
||
|
`Parser`'s methods for remembering/restoring a successful state:
|
||
|
you can first call `parser.Confirm()` to record that all the parsing so far
|
||
|
was successful and should not be discarded, then try to parse hex color.
|
||
|
After that:
|
||
|
|
||
|
* If parsing was successful, - `parser.Ok()` check will return `true` and
|
||
|
you can call `parser.Confirm()` again to mark this new state as one that
|
||
|
shouldn't be discarded.
|
||
|
* Otherwise you can call `parser.R()` to reset your `parser` to the state it
|
||
|
was at the last `parser.Confirm()` call
|
||
|
(or the initial state if no `parser.Confirm()` calls were made)
|
||
|
and try parsing the color in some other way.
|
||
|
|
||
|
```unrealscript
|
||
|
local Parser parser;
|
||
|
local int redComponent, greenComponent, blueComponent;
|
||
|
...
|
||
|
// Suppose we've successfully parsed something and
|
||
|
// need to parse color in one of the two forms next,
|
||
|
// so we remember the current state
|
||
|
parser.Confirm(); // This won't do anything if `parser` has already failed
|
||
|
// Try parsing color in it's rgb-form;
|
||
|
// It's not a major issue to have this many calls before checking for success,
|
||
|
// since once one of them has failed - others won't even try to do anything.
|
||
|
parser.MatchS("rgb(", SCASE_INSENSITIVE).Skip()
|
||
|
.MInteger(redComponent).Skip().MatchS(",").Skip()
|
||
|
.MInteger(greenComponent).Skip().MatchS(",").Skip()
|
||
|
.MInteger(blueComponent).Skip().MatchS(")");
|
||
|
// If we've failed - try hex representation
|
||
|
if (!parser.Ok())
|
||
|
{
|
||
|
parser.R().MatchS("#")
|
||
|
.MUnsignedInteger(redComponent, 16, 2)
|
||
|
.MUnsignedInteger(greenComponent, 16, 2)
|
||
|
.MUnsignedInteger(blueComponent, 16, 2);
|
||
|
}
|
||
|
// It's fine to call `Confirm()` without checking for success,
|
||
|
// since it won't do anything for a parser in a failed state
|
||
|
parser.Confirm();
|
||
|
```
|
||
|
|
||
|
>You can store even more different parser states with
|
||
|
`GetCurrentState()` / `RestoreState()` methods.
|
||
|
In fact, these are the ones used inside a lot of Acedia's methods to avoid
|
||
|
changing main `Parser`'s state that user can rely on.
|
||
|
|
||
|
For more details and examples see the source code of `Parser.uc` or
|
||
|
any Acedia source code that uses `Parser`s.
|
||
|
|
||
|
## JSON support
|
||
|
|
||
|
> **NOTE:**
|
||
|
> This section is closely linked with [Collections](../API/Collections.md).
|
||
|
|
||
|
Acedia's text capabilities also provide limited JSON support.
|
||
|
That is, Acedia can display some of it's types as JSON and parse any valid JSON
|
||
|
into its types/collections, but it does not guarantee verification of whether
|
||
|
parsed JSON is valid and can also accept some technically invalid JSON.
|
||
|
|
||
|
Main methods for these tasks are `_.json.Print()`/`_.json.PrettyPrint()` and
|
||
|
`_.json.ParseWith()`, but there are some more type-specialized methods as well.
|
||
|
Here are the current rules of conversion from JSON to Acedia's types via
|
||
|
`_.json.ParseWith()`:
|
||
|
|
||
|
1. Null values will be returned as `none`;
|
||
|
2. Number values will be return as an `IntBox`/`IntRef` if they consist
|
||
|
of only digits (and optionally a sign) and `FloatBox`/`FloatRef`
|
||
|
otherwise. Choice between box and ref is made based on
|
||
|
`parseAsMutable` parameter (boxes are immutable, refs are mutable);
|
||
|
3. String values will be parsed as `Text`/`MutableText`, based on
|
||
|
`parseAsMutable` parameter;
|
||
|
4. Array values will be parsed as a `DynamicArray`, it's items parsed
|
||
|
according to these rules (`parseAsMutable` parameter is propagated).
|
||
|
5. Object values will be parsed as a `AssociativeArray`, it's items
|
||
|
parsed according to these rules (`parseAsMutable` parameter is
|
||
|
propagated) and recorded under the keys parsed into `Text`.
|
||
|
|
||
|
And printing with `_.json.Print()`/`_.json.PrettyPrint()` follows
|
||
|
symmetrical rules:
|
||
|
|
||
|
1. `none` is printed into "null";
|
||
|
2. Boolean types (`BoolBox`/`BoolRef`) are printed into JSON bool value;
|
||
|
3. Integer (`IntBox`/`IntRef`) and float (`FloatBox`/`FloatRef`) types
|
||
|
are printed into JSON number value;
|
||
|
4. `Text` and `MutableText` are printed into JSON string value;
|
||
|
5. `DynamicArray` is printed into JSON array with `Print()` method
|
||
|
applied to each of its items. If some of them have not printable
|
||
|
types - "none" will be used for them as a replacement.
|
||
|
6. `AssociativeArray` is printed into JSON object with `Print()` method
|
||
|
applied to each of it's items. Only items with `Text` keys are
|
||
|
printed, the rest is omitted. If some of them have not printable
|
||
|
types - "none" will be used for them as a replacement.
|
||
|
|
||
|
The difference between `_.json.Print()` and `_.json.PrettyPrint()` is that
|
||
|
`_.json.Print()` prints out a minimal, compact json, while
|
||
|
`_.json.PrettyPrint()` prints a more human-readable JSON with indentation and
|
||
|
color highlights.
|