diff --git a/README.md b/README.md index 4731a09..9df9633 100644 --- a/README.md +++ b/README.md @@ -24,6 +24,7 @@ Some of the supported features are: - Basic arithmetical operations(`+`, `-`, `*`, `/`, `^`, `%`); - Scientific notation support(`5e3` -> `5000`); - Trigonometrical functions(`sin`, `cos`, `tan`, `asin`, `acos`, `atan`); +- Statistical functions(permutations, combinations, summation, sum of squares, mean, standard deviation, linear regression); - Base conversion(binary: `pb`, octal: `po`, hexadecimal: `px`); - Factorial and constants(`!`, `pi`, `e`); - Random number generator(`@`); @@ -285,6 +286,14 @@ lV 10 M FF { 2 :A # Red [ [ RGB( ] P 2 ;A P lc p. 1 ;A P lc p. 0 ;A P [ ) ] p. [ = ] p. lV ph ] x ``` +19. Find the mean of the following temperatures(Celsius): `[25, 15, 9.5, 10, 20, 16, 20]`: +``` +4 k +25 15 9.5 10 20 16 20 +SX SX SX SX SX SX SX +gM p # Prints 16.5000 +``` + ## License [GPLv3](https://choosealicense.com/licenses/gpl-3.0/) diff --git a/man.md b/man.md index 00c61ab..32c47e7 100644 --- a/man.md +++ b/man.md @@ -3,7 +3,7 @@ title: dc section: 1 header: General Commands Manual footer: Marco Cetica -date: March 26, 2024 +date: March 27, 2024 --- @@ -225,9 +225,10 @@ Pops one value, computes its `acos`, and pushes that. Pops one value, computes its `atan`, and pushes that. ## Statistics Operations -**dc** supports various common statistics operations, such as permutations, combinations, mean, standard deviation -summation, sum of squares and linear regression. All statistics functions are limited to **non-negative integers**. -Accumulating functions use the `X` register. +**dc** supports various common statistics operations, such as permutations, combinations, mean, standard deviation, +summation, sum of squares and linear regression. Most statistics functions are limited to **non-negative integers**. +The accumulating functions(such as the mean, the standard deviation and the linear regression) use the `X` and +the `Y` registers. 
**gP** @@ -241,10 +242,6 @@ Pops two non-negative integers(that is, >=0) and computes `C_{y, x}`, that is th `y` different items taken in quantities of `x` items at a time. No item shall occur more than once in a set and different orders of the same `x` items are not counted separately. The `y` parameter correspond to the second one value popped while the `x` parameter correspond to the first one popped. -**gN** - -Counts the number of accumulated elements of the `X` register's stack and pushes that. - **gs** Computes Σx of the `X` register's stack and pushes that. @@ -261,6 +258,31 @@ Computes x̄(mean) of the `X` register's stack and pushes that. Computes σ(standard deviation) of the `X` register's stack and pushes that. +**gL** + +Computes linear regression of the `X` register's stack and the `Y` register's stack. Linear regression is a +simple statistical model to find a relationship between a *dependent variable*(`Y`) and an independent +variable(`X`). This function will compute the following linear equation: +$$ + y = mx + b +$$ + +using the following formulae: + +``` + ( n * ∑(x_i * y_i) ) - ( ∑x_i * ∑y_i ) + m = --------------------------------------- + n * ∑x_i^2 - (∑x_i)^2 + + + ∑y_i - (m * ∑x_i) + b = --------------------------------------- + n +``` + +Where **n** is the number of elements of each set. +The results - the _slope_ **m** and the _y-intercept_ **b** - will be pushed onto the stack in that order. + ## Base Conversion **pb** @@ -629,10 +651,18 @@ lV 10 M FF { 2 :A # Red [ [ RGB( ] P 2 ;A P lc p. 1 ;A P lc p. 0 ;A P [ ) ] p. [ = ] p. lV ph ] x ``` +15. Find the mean of the following temperatures(Celsius): `[25, 15, 9.5, 10, 20, 16, 20]`: +``` +4 k +25 15 9.5 10 20 16 20 +SX SX SX SX SX SX SX +gM p # Prints 16.5000 +``` + # AUTHORS The original version of the **dc** command was written by Robert Morris and Lorinda Cherry. This version of **dc** is developed by Marco Cetica. 
# BUGS -If you encounter any kind of problem, email me at [email@marcocetica.com](mailto:email@marcocetica.com) or open an issue at [https://github.com/ice-bit/dc](https://github.com/ice-bit/dc). +If you encounter any kind of problem, email me at [email@marcocetica.com](mailto:email@marcocetica.com) or open an issue at [https://github.com/ceticamarco/dc](https://github.com/ceticamarco/dc). diff --git a/src/eval.cpp b/src/eval.cpp index 256f1cc..e4b44e5 100644 --- a/src/eval.cpp +++ b/src/eval.cpp @@ -52,6 +52,7 @@ void Evaluate::init_environment() { this->op_factory.emplace("gS", MAKE_UNIQUE_PTR(Statistics, OPType::SUMXX)); this->op_factory.emplace("gM", MAKE_UNIQUE_PTR(Statistics, OPType::MEAN)); this->op_factory.emplace("gD", MAKE_UNIQUE_PTR(Statistics, OPType::SDEV)); + this->op_factory.emplace("gL", MAKE_UNIQUE_PTR(Statistics, OPType::LREG)); // Bitwise operations this->op_factory.emplace("{", MAKE_UNIQUE_PTR(Bitwise, OPType::BAND)); this->op_factory.emplace("}", MAKE_UNIQUE_PTR(Bitwise, OPType::BOR)); diff --git a/src/statistics.cpp b/src/statistics.cpp index 261b313..c53ae99 100644 --- a/src/statistics.cpp +++ b/src/statistics.cpp @@ -15,6 +15,7 @@ std::optional Statistics::exec(dc::Stack &stack, dc::P case OPType::SUMXX: err = fn_sum_squared(stack, parameters, regs); break; case OPType::MEAN: err = fn_mean(stack, parameters, regs); break; case OPType::SDEV: err = fn_sdev(stack, parameters, regs); break; + case OPType::LREG: err = fn_lreg(stack, parameters, regs); break; default: break; } @@ -51,7 +52,7 @@ std::optional Statistics::fn_perm(dc::Stack &stack, co unsigned long long permutation = numerator_opt.value() / denominator_opt.value(); stack.push(trim_digits(permutation, parameters.precision)); } else { - return "'gP' requires integers values"; + return "'gP' requires integer values"; } return std::nullopt; @@ -90,7 +91,7 @@ std::optional Statistics::fn_comb(dc::Stack &stack, co stack.push(trim_digits(combination, parameters.precision)); } else { - 
return "'gC' requires integers values"; + return "'gC' requires integer values"; } return std::nullopt; @@ -176,7 +177,7 @@ std::optional Statistics::fn_sdev(dc::Stack &stack, co return acc + std::pow(deviation, 2); }); // Then compute the mean of previos values(variance) - auto variance = sum_of_deviations / count; + auto variance = sum_of_deviations / (count-1); // Finally, compute the square root of the variance(standard deviation) auto s_dev = sqrt(variance); @@ -185,6 +186,67 @@ std::optional Statistics::fn_sdev(dc::Stack &stack, co return std::nullopt; } +std::optional Statistics::fn_lreg(dc::Stack &stack, const dc::Parameters &parameters, std::unordered_map &regs) { + // Check whether 'x' register exists + if(regs.find('X') == regs.end()) { + return "Register 'X' is undefined"; + } + + // Check if register's stack is empty + if(regs['X'].stack.empty()) { + return "The stack of register 'X' is empty"; + } + + // Check whether 'y' register exists + if(regs.find('Y') == regs.end()) { + return "Register 'Y' is undefined"; + } + + // Check if register's stack is empty + if(regs['Y'].stack.empty()) { + return "The stack of register 'Y' is empty"; + } + + // Check that both registers have the same length + if(regs['X'].stack.size() != regs['Y'].stack.size()) { + return "'X' and 'Y' registers must be of the same length"; + } + + // Otherwise, retrieve count and summations of both sets + auto count = regs['X'].stack.size(); + auto x_sum = regs['X'].stack.summation(); + auto y_sum = regs['Y'].stack.summation(); + + // Then compute the sum of products + const auto& x_ref = regs['X'].stack.get_ref(); + const auto& y_ref = regs['Y'].stack.get_ref(); + std::size_t idx = 0; + double sum_of_products = 0.0; + + for(auto it : x_ref) { + auto x = std::stod(it); + auto y = std::stod(y_ref[idx++]); + sum_of_products += (x * y); + } + + // Then compute the sum of squares + auto x_sum_squares = regs['X'].stack.summation_squared(); + + // Then compute the slope of the line(m) + auto 
slope_numerator = ((count * sum_of_products) - (x_sum * y_sum)); + auto slope_denominator = ((count * x_sum_squares) - std::pow(x_sum, 2)); + auto slope = slope_numerator / slope_denominator; + + // Then compute the intercept of the line(b) + auto intercept = (y_sum - (slope * x_sum)) / count; + + // Finally push the slope and the intercept(in this order) into the main stack + stack.push(trim_digits(slope, parameters.precision)); + stack.push(trim_digits(intercept, parameters.precision)); + + return std::nullopt; +} + std::optional Statistics::factorial(const long long n) { if(n < 0) { return std::nullopt; diff --git a/src/statistics.h b/src/statistics.h index fd8766e..1342f1e 100644 --- a/src/statistics.h +++ b/src/statistics.h @@ -14,6 +14,7 @@ private: std::optional fn_sum_squared(dc::Stack &stack, const dc::Parameters &parameters, std::unordered_map &regs); std::optional fn_mean(dc::Stack &stack, const dc::Parameters &parameters, std::unordered_map &regs); std::optional fn_sdev(dc::Stack &stack, const dc::Parameters &parameters, std::unordered_map &regs); + std::optional fn_lreg(dc::Stack &stack, const dc::Parameters &parameters, std::unordered_map &regs); std::optional factorial(const long long n); std::string trim_digits(double number, unsigned int precision); diff --git a/tests/test_comb b/tests/test_comb new file mode 100644 index 0000000..27dee5b --- /dev/null +++ b/tests/test_comb @@ -0,0 +1,20 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="276" + ACTUAL=$("$PROGRAM" -e '24 2 gC p') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="'gC' requires two operands" + ACTUAL=$("$PROGRAM" -e 'gC' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test non numerical values + EXPECTED="'gC' requires integer values" + ACTUAL=$("$PROGRAM" -e '[ foo ] 5 gC' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_perm b/tests/test_perm new file mode 100644 index 0000000..69c8c3c --- /dev/null 
+++ b/tests/test_perm @@ -0,0 +1,20 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="2520" + ACTUAL=$("$PROGRAM" -e '7 5 gP p') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="'gP' requires two operands" + ACTUAL=$("$PROGRAM" -e 'gP' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test non numerical values + EXPECTED="'gP' requires integer values" + ACTUAL=$("$PROGRAM" -e '[ foo ] 5 gP' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_ravg b/tests/test_ravg new file mode 100644 index 0000000..4e841c6 --- /dev/null +++ b/tests/test_ravg @@ -0,0 +1,20 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="5" + ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gM p cX') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test undefined register + EXPECTED="Register 'X' is undefined" + ACTUAL=$("$PROGRAM" -e 'gM' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'X' is empty" + ACTUAL=$("$PROGRAM" -e '0 5 :X gM' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_rlr b/tests/test_rlr new file mode 100644 index 0000000..c2729f8 --- /dev/null +++ b/tests/test_rlr @@ -0,0 +1,38 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + # Test undefined register + EXPECTED="Register 'X' is undefined" + ACTUAL=$("$PROGRAM" -e 'gL' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'X' is empty" + ACTUAL=$("$PROGRAM" -e '0 5 :X gL' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test undefined register + EXPECTED="Register 'Y' is undefined" + ACTUAL=$("$PROGRAM" -e '5 SX gL' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'Y' is empty" + ACTUAL=$("$PROGRAM" -e '5 SX 0 5 :Y gL' 2>&1) || true + assert_eq "$EXPECTED" 
"$ACTUAL" + + # Test stacks of different sizes + EXPECTED="'X' and 'Y' registers must be of the same length" + ACTUAL=$("$PROGRAM" -e '5 SX 5 SY 7 SY gL' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test with real values + EXPECTED="1 0" + ACTUAL=$("$PROGRAM" -e "cX cY \ + 1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX \ + 1 2 3 4 5 6 7 8 9 SY SY SY SY SY SY SY SY SY \ + gL r p. r p") + assert_eq "$EXPECTED" "$ACTUAL" +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_rsdev b/tests/test_rsdev new file mode 100644 index 0000000..1d93662 --- /dev/null +++ b/tests/test_rsdev @@ -0,0 +1,20 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="2.7386" + ACTUAL=$("$PROGRAM" -e '4 k 1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gD p cX') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test undefined register + EXPECTED="Register 'X' is undefined" + ACTUAL=$("$PROGRAM" -e 'gD' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'X' is empty" + ACTUAL=$("$PROGRAM" -e '0 5 :X gD' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_rsum b/tests/test_rsum new file mode 100644 index 0000000..5188256 --- /dev/null +++ b/tests/test_rsum @@ -0,0 +1,20 @@ +#!/bin/sh + +utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="45" + ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gs p cX') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test undefined register + EXPECTED="Register 'X' is undefined" + ACTUAL=$("$PROGRAM" -e 'gs' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'X' is empty" + ACTUAL=$("$PROGRAM" -e '0 5 :X gs' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab: diff --git a/tests/test_rsumx b/tests/test_rsumx new file mode 100644 index 0000000..e78757a --- /dev/null +++ b/tests/test_rsumx @@ -0,0 +1,20 @@ +#!/bin/sh + 
+utest() { + PROGRAM="$PWD/build/dc" + EXPECTED="285" + ACTUAL=$("$PROGRAM" -e '1 2 3 4 5 6 7 8 9 SX SX SX SX SX SX SX SX SX gS p cX') + assert_eq "$EXPECTED" "$ACTUAL" + + # Test undefined register + EXPECTED="Register 'X' is undefined" + ACTUAL=$("$PROGRAM" -e 'gS' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + + # Test empty stack + EXPECTED="The stack of register 'X' is empty" + ACTUAL=$("$PROGRAM" -e '0 5 :X gS' 2>&1) || true + assert_eq "$EXPECTED" "$ACTUAL" + +} +# vim: ts=4 sw=4 softtabstop=4 expandtab:
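
Review note: the arithmetic behind `gL` and the revised `gD` can be checked independently of the register machinery. The following is a minimal standalone sketch, not code from this patch: `LinearFit`, `linear_fit` and `sample_sdev` are hypothetical names, and the inputs are plain vectors rather than the `X` and `Y` register stacks. It applies the least-squares formulae from the man page and the `n - 1` divisor introduced in `fn_sdev`. Unlike the patch, it assumes that a zero denominator (all `x` values equal) or fewer than two samples should be reported as "no result" instead of being divided through.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type for illustration; it does not exist in the dc sources.
struct LinearFit {
    double slope;      // m in y = mx + b
    double intercept;  // b in y = mx + b
};

// Least-squares fit of y = mx + b over paired samples.
// Returns std::nullopt when the sets are empty, differ in length,
// or when the slope denominator is exactly zero.
std::optional<LinearFit> linear_fit(const std::vector<double>& xs,
                                    const std::vector<double>& ys) {
    if (xs.empty() || xs.size() != ys.size()) {
        return std::nullopt;
    }

    const double n = static_cast<double>(xs.size());
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;

    for (std::size_t i = 0; i < xs.size(); ++i) {
        sum_x += xs[i];
        sum_y += ys[i];
        sum_xy += xs[i] * ys[i];
        sum_xx += xs[i] * xs[i];
    }

    // m = (n * sum(x*y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
    const double denominator = (n * sum_xx) - (sum_x * sum_x);
    if (denominator == 0.0) {
        return std::nullopt;  // assumption: treat a vertical line as an error
    }
    const double slope = ((n * sum_xy) - (sum_x * sum_y)) / denominator;

    // b = (sum(y) - m * sum(x)) / n
    const double intercept = (sum_y - (slope * sum_x)) / n;

    return LinearFit{slope, intercept};
}

// Sample standard deviation, dividing the sum of squared deviations by n - 1.
// Returns std::nullopt for fewer than two samples, where n - 1 would be zero.
std::optional<double> sample_sdev(const std::vector<double>& xs) {
    if (xs.size() < 2) {
        return std::nullopt;
    }

    const double n = static_cast<double>(xs.size());
    double sum = 0.0;
    for (const double x : xs) {
        sum += x;
    }
    const double mean = sum / n;

    double sum_of_deviations = 0.0;
    for (const double x : xs) {
        sum_of_deviations += (x - mean) * (x - mean);
    }

    return std::sqrt(sum_of_deviations / (n - 1.0));
}
```

With `xs` and `ys` both set to the values 1 through 9, as in `tests/test_rlr` and `tests/test_rsdev`, `linear_fit` yields a slope of 1 and an intercept of 0, and `sample_sdev` yields the square root of 7.5, which matches the `2.7386` expected by the test at four digits of precision.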