Merge pull request #218 from sailpoint-oss/feature/docs-transform-decomposediacriticalmarks

Added info about decomposediacriticalmarks transform.
2025-12-06 04:19:31 +00:00 · 2023-04-14 09:42:28 -05:00
parent 0f5971b0a6 796ab2ac24
commit fa4d9a98b1
1 changed files with 19 additions and 0 deletions
--- a/products/idn/docs/identity-now/transforms/operations/decompose-diacritical-marks.md
+++ b/products/idn/docs/identity-now/transforms/operations/decompose-diacritical-marks.md
@@ -21,6 +21,10 @@ The following are examples of diacritical marks:
 > - Ň
 > - Ŵ

+The decomposeDiacriticalMarks transform uses Java's [Normalizer class](https://docs.oracle.com/javase/7/docs/api/java/text/Normalizer.html) to decompose the diacritical marks. Specifically, it uses Normalization Form KD (NFKD), as described in Sections 3.6, 3.10, and 3.11 of the Unicode Standard and summarized under [Annex 4: Decomposition](https://www.unicode.org/reports/tr15/tr15-23.html#Decomposition). NFKD splits an accented character such as `č` into its base letter (`c`) followed by a separate combining mark.
+
+After decomposition, the transform uses a [Regex Replace](https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) to remove every character in the Unicode `InCombiningDiacriticalMarks` block (U+0300 to U+036F), for example `replaceAll("[\\p{InCombiningDiacriticalMarks}]", "")`.
+
 ## Transform Structure

 The transform for decompose diacritical marks requires only the transform's `type` and `name` attributes:
@@ -88,3 +92,18 @@ Output: "Dubcek"
  "name": "Decompose Diacritical Marks Transform"
 }
 ```
+
+## Testing
+
+To test in code, use the following Java code to compare the transform's results with the results of your own code. Here, `input` is a `String` variable that holds the value to test:
+
+```java
+import java.text.Normalizer;
+import java.util.regex.Pattern;
+
+// Decompose each character of the String `input` into a base character plus combining marks (NFKD)
+input = Normalizer.normalize(input, Normalizer.Form.NFKD);
+
+// Remove the combining marks, leaving only the base characters
+input = input.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
+```
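
The testing snippet in this change assumes an existing `input` variable. As a minimal sketch, it could be wrapped in a small runnable class like the one below. The class name `DecomposeDiacriticalMarksCheck` and the helper `stripDiacriticalMarks` are hypothetical names chosen for illustration, and the sample strings are assumptions, not values taken from the transform's own test suite. Only the two calls shown in the diff (`Normalizer.normalize` with `NFKD`, then `replaceAll`) come from the documentation.

```java
import java.text.Normalizer;

// Hypothetical harness; not part of the transform itself.
public class DecomposeDiacriticalMarksCheck {

    // Mirrors the two documented steps: NFKD decomposition, then mark removal.
    public static String stripDiacriticalMarks(String input) {
        // Split each character into its base character plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFKD);
        // Drop everything in the Combining Diacritical Marks block (U+0300 to U+036F).
        return decomposed.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of compiler encoding.
        String[] samples = {
            "Dub\u010Dek", // c with caron; result is "Dubcek"
            "\u0100ric",   // A with macron; result is "Aric"
            "\uFB01le"     // "fi" ligature; NFKD also expands compatibility characters
        };
        for (String sample : samples) {
            System.out.println(stripDiacriticalMarks(sample));
        }
    }
}
```

Because NFKD is a compatibility decomposition, characters such as ligatures are also rewritten (the last sample becomes `file`), so a comparison against the transform's output may differ from the input in more than just accents.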