[GOBBLIN-1901] Define MultiActiveLeaseArbiter Decorator to Model Failed Lease Completion #3765

Open
wants to merge 3 commits into
base: master
@@ -103,6 +103,11 @@ public class ConfigurationKeys {
public static final String DEFAULT_SCHEDULER_LEASE_DETERMINATION_STORE_DB_TABLE = "gobblin_scheduler_lease_determination_store";
// Refers to the event we originally tried to acquire a lease which achieved `consensus` among participants through
// the database
public static final String MULTI_ACTIVE_LEASE_ARBITER_HOST_TO_BIT_MASK_MAP = MYSQL_LEASE_ARBITER_PREFIX + ".hostToBitMaskMap";
Contributor

since this is not interpreted by the MysqlMALeaseArbiter, it shouldn't borrow that class's config prefix. let's make a prefix just for the class reading this config. (that will clue in config maintainers that it's unnecessary once the class itself is no longer being used.)

public static final String MULTI_ACTIVE_LEASE_ARBITER_BIT_MASK_LENGTH = MYSQL_LEASE_ARBITER_PREFIX + ".bitMaskLength";
Contributor

usually BIT_MASK is one word, so BITMASK and bitmask

public static final int DEFAULT_MULTI_ACTIVE_LEASE_ARBITER_BIT_MASK_LENGTH = 4;
public static final String MULTI_ACTIVE_LEASE_ARBITER_TESTING_DECORATOR_NUM_HOSTS = MYSQL_LEASE_ARBITER_PREFIX + ".numHosts";
public static final int DEFAULT_MULTI_ACTIVE_LEASE_ARBITER_TESTING_DECORATOR_NUM_HOSTS = 4;
public static final String SCHEDULER_PRESERVED_CONSENSUS_EVENT_TIME_MILLIS_KEY = "preservedConsensusEventTimeMillis";
// Time the reminder event Trigger is supposed to fire from the scheduler
public static final String SCHEDULER_EXPECTED_REMINDER_TIME_MILLIS_KEY = "expectedReminderTimeMillis";
@@ -0,0 +1,206 @@
package org.apache.gobblin.runtime.api;

import com.google.common.base.Optional;
import com.typesafe.config.Config;
import java.io.IOException;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import javax.inject.Inject;
import lombok.extern.slf4j.Slf4j;
import org.apache.gobblin.configuration.ConfigurationKeys;
import org.apache.gobblin.util.ConfigUtils;
import org.apache.gobblin.util.HostUtils;


/**
* This class is a decorator for {@link MysqlMultiActiveLeaseArbiter} used to model scenarios where a lease owner
* intermittently fails to complete a lease (representing a variety of slowness or failure cases that can arise on the
* participant side, in network connectivity, or in the database).
*
* It will fail on calls to {@link MysqlMultiActiveLeaseArbiter#recordLeaseSuccess} where a function of the lease
* obtained timestamp matches a bitmask of the host. Ideally, each participant should fail on different calls (with
* limited overlap if we want to test that). We use a deterministic method of failing some calls to complete a lease
* success with the following methodology. We take the binary representation of the lease obtained timestamp, scatter
* its bits through bit interleaving of the first and second halves of the binary representation to differentiate
* behavior of consecutive timestamps, and compare the last N digits (determined through config) to the bit mask of the
Comment on lines +25 to +27
Contributor

I really like the idea of scattering (to avoid long runs of on/off). while I have very little background in number theory, the idea of interleaving high-order with low-order bits intuitively concerns me... so not sure if I'm missing something.

here's my logic: when representing millis as a java long, we can essentially consider the high-order bits to be constants, since the lowest of the long's high bits [32:63] only changes once every 2^32 ms (roughly 50 days). after interleaving, wouldn't odd-position bits nearly never change, while even-position bits would? the assignment:

host0:0001,host1:0010,host2:0100,host3:1000

would give drastically different probabilities for host1 and host3 than for host0 and host2, right?

instead I suggest bit "scattering" by passing the number through a hash digest, like MD5 or SHA1
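
The hash-digest suggestion above could look roughly like the following. This is a minimal sketch, not code from the PR: the class `HashScatterSketch` and the methods `scatterTimestamp` and `shouldFail` are hypothetical names, SHA-1 is just one of the digests mentioned, and taking the first 8 bytes of the digest as a `long` is an arbitrary choice.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not part of this PR): scatters a timestamp's bits with a hash digest. */
final class HashScatterSketch {

  /** Returns the first 8 bytes of the SHA-1 digest of the timestamp, read as a big-endian long. */
  static long scatterTimestamp(long timestampMillis) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-1");
      byte[] input = ByteBuffer.allocate(Long.BYTES).putLong(timestampMillis).array();
      byte[] hash = digest.digest(input); // 20 bytes for SHA-1
      return ByteBuffer.wrap(hash).getLong();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 must be supported by every Java platform implementation
      throw new IllegalStateException(e);
    }
  }

  /** Fails when any bit selected by the host's bitmask is set in the scattered timestamp. */
  static boolean shouldFail(long timestampMillis, int bitmask) {
    return (scatterTimestamp(timestampMillis) & bitmask) != 0;
  }

  public static void main(String[] args) {
    long base = 1694500000000L;
    // Print the low 4 bits for a few consecutive timestamps to eyeball how they vary
    for (long ts = base; ts < base + 4; ts++) {
      System.out.printf("%d -> low 4 bits %s%n", ts, Long.toBinaryString(scatterTimestamp(ts) & 0b1111));
    }
  }
}
```

Because a digest mixes every input bit into every output bit, each of the low bits should flip for roughly half of all timestamps, regardless of which input bits are effectively constant. The result is still deterministic, so the same timestamp fails on the same host every time.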

* host. If the bitwise AND comparison to the host bit mask equals the bitmask we fail the call.
*/
@Slf4j
public class MysqlMultiActiveLeaseArbiterTestingDecorator extends MysqlMultiActiveLeaseArbiter {
Contributor @phet, Sep 12, 2023

the name TestingDecorator suggests nothing about behavior. Decorator seems a relevant pattern, and suggests reusability, but it's more important to name how/why it decorates than to state vaguely that decoration is afoot.

how about FailureInjectingMALeaseArbiter or FailureInjectingMALeaseArbiterDecorator?

Also, on the subject of reuse, why base this specifically on MysqlMALeaseArbiter, rather than the MALeaseArbiter interface/base class? as currently written, this is not truly a decorator, but rather an extension to MysqlMALA. (in java a decorator would implement an interface, but almost never extend a concrete derived class that already implements it.)

to be a decorator, the ctor should either take a MALeaseArbiter instance to delegate to or else create one internally. here, I suggest the latter, which means this class (itself named within config) should itself read its own (prefixed) config to learn the class name it should internally create an instance of.
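
As a rough illustration of the decorator shape described above, here is a sketch that implements an interface and delegates to a wrapped instance. Everything in it is an assumption made for illustration: `LeaseArbiterSketch` is a deliberately simplified stand-in for the real `MultiActiveLeaseArbiter` interface (one method, a plain `long` argument, no checked exception), and this variant takes the delegate in the constructor rather than instantiating it from a configured class name as the comment prefers.

```java
import java.util.function.LongPredicate;

/** Simplified, hypothetical stand-in for the MultiActiveLeaseArbiter interface. */
interface LeaseArbiterSketch {
  boolean recordLeaseSuccess(long leaseAcquisitionTimestamp);
}

/** Decorator: implements the interface and wraps another implementation instead of extending a concrete class. */
final class FailureInjectingLeaseArbiterSketch implements LeaseArbiterSketch {
  private final LeaseArbiterSketch delegate;
  private final LongPredicate shouldFail;

  FailureInjectingLeaseArbiterSketch(LeaseArbiterSketch delegate, LongPredicate shouldFail) {
    this.delegate = delegate;
    this.shouldFail = shouldFail;
  }

  @Override
  public boolean recordLeaseSuccess(long leaseAcquisitionTimestamp) {
    if (shouldFail.test(leaseAcquisitionTimestamp)) {
      return false; // injected failure: the wrapped arbiter is never called
    }
    return delegate.recordLeaseSuccess(leaseAcquisitionTimestamp);
  }

  public static void main(String[] args) {
    LeaseArbiterSketch alwaysSucceeds = ts -> true;
    LeaseArbiterSketch decorated =
        new FailureInjectingLeaseArbiterSketch(alwaysSucceeds, ts -> (ts & 0b0001) != 0);
    System.out.println(decorated.recordLeaseSuccess(2L)); // delegated to the wrapped arbiter
    System.out.println(decorated.recordLeaseSuccess(3L)); // failure injected
  }
}
```

With this shape the failure-injection logic works with any arbiter implementation, and removing it later only means unwrapping the delegate.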

private final int bitMaskLength;
Contributor @phet, Sep 12, 2023

thinking about it, I don't actually believe we need separately to store the length of the bitmask. we can just take whatever value is passed in and use that; e.g.

int bitmaskBits = config.readInt(...)

private final int numHosts;
private final HashMap<Integer, Integer> hostIdToBitMask = new HashMap<>();
Contributor

shouldn't there be only one MALA instance per host? if so, it shouldn't need to worry about any host bitmask but its own, so probably an Optional<Integer>.


@Inject
public MysqlMultiActiveLeaseArbiterTestingDecorator(Config config) throws IOException {
super(config);
bitMaskLength = ConfigUtils.getInt(config, ConfigurationKeys.MULTI_ACTIVE_LEASE_ARBITER_BIT_MASK_LENGTH,
ConfigurationKeys.DEFAULT_MULTI_ACTIVE_LEASE_ARBITER_BIT_MASK_LENGTH);
numHosts = ConfigUtils.getInt(config, ConfigurationKeys.MULTI_ACTIVE_LEASE_ARBITER_TESTING_DECORATOR_NUM_HOSTS,
ConfigurationKeys.DEFAULT_MULTI_ACTIVE_LEASE_ARBITER_TESTING_DECORATOR_NUM_HOSTS);
initializeHostToBitMaskMap(config);
}

/**
* Extract the bit mask from the input config if one is present. Otherwise set the default bit mask for each host id,
* chosen so that no two hosts have overlapping bits and a given status will not fail on multiple hosts.
* @param config expected to contain a mapping of host address to bit mask in format
* "host1:bitMask1,host2:bitMask2,...,hostN:bitMaskN"
* Note: if the mapping format is incorrect or there are fewer than `numHosts` mappings provided, we utilize
* the defaults to prevent unintended consequences of overlapping bit masks.
*/
protected void initializeHostToBitMaskMap(Config config) {
// Set default bit masks for each host
// TODO: change this to parse default from Configuration.Keys property or is that unnecessary?
hostIdToBitMask.put(1, 0b0001);
hostIdToBitMask.put(2, 0b0010);
hostIdToBitMask.put(3, 0b0100);
hostIdToBitMask.put(4, 0b1000);
Comment on lines +56 to +60
Contributor

seems more than unnecessary, and actually confusing. if a host is not named in the config, its bitmask should match nothing. as far as behavior goes, such hosts should play it straight and not inject any failure


// If a valid mapping is provided in config, then we overwrite all the default values.
if (config.hasPath(ConfigurationKeys.MULTI_ACTIVE_LEASE_ARBITER_HOST_TO_BIT_MASK_MAP)) {
String stringMap = config.getString(ConfigurationKeys.MULTI_ACTIVE_LEASE_ARBITER_HOST_TO_BIT_MASK_MAP);
Optional<HashMap<InetAddress,Integer>> addressToBitMapOptional = validateStringMap(stringMap, numHosts, bitMaskLength);
if (addressToBitMapOptional.isPresent()) {
for (InetAddress inetAddress : addressToBitMapOptional.get().keySet()) {
hostIdToBitMask.put(getHostIdFromAddress(inetAddress), addressToBitMapOptional.get().get(inetAddress));
}
}
}
}

protected static Optional<HashMap<InetAddress,Integer>> validateStringMap(String stringMap, int numHosts, int bitMaskLength) {
// TODO: Refactor to increase abstraction
String[] hostAddressToMap = stringMap.split(",");
if (hostAddressToMap.length < numHosts) {
log.warn("Host address to bit mask map expected to be in format "
+ "`host1:bitMask1,host2:bitMask2,...,hostN:bitMaskN` with at least " + numHosts + " hosts necessary. Using "
+ "default.");
return Optional.absent();
}
HashMap<InetAddress,Integer> addressToBitmap = new HashMap<>();
for (String mapping : hostAddressToMap) {
String[] keyAndValue = mapping.split(":");
if (keyAndValue.length != 2) {
log.warn("Host address to bit mask map should be separated by `:`. Expected format "
+ "`host1:bitMask1,host2:bitMask2,...,hostN:bitMaskN`. Using default.");
return Optional.absent();
}
Optional<InetAddress> addressOptional = HostUtils.getAddressForHostName(keyAndValue[0]);
if (!addressOptional.isPresent()) {
log.warn("Invalid hostname format in configuration. Using default.");
return Optional.absent();
}
if (!isValidBitMask(keyAndValue[1], bitMaskLength)) {
log.warn("Invalid bit mask format in configuration, expected to be " + bitMaskLength + " digit binary number "
+ "ie: `1010`. Using default.");
return Optional.absent();
}
addressToBitmap.put(addressOptional.get(), Integer.valueOf(keyAndValue[1], 2));
}
return Optional.of(addressToBitmap);
}

protected static boolean isValidBitMask(String input, int bitMaskLength) {
// Check if the string contains only 0s and 1s
if (!input.matches("[01]+")) {
return false;
}
// Check if the string is exactly `bitMaskLength` characters long
if (input.length() != bitMaskLength) {
return false;
}
return true;
}

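/**
* Retrieve the host id as a number between 1 and `numHosts` (inclusive) using the host address's hash code.
* @return the host id for the given address
*/
protected int getHostIdFromAddress(InetAddress address) {
// floorMod keeps the result non-negative even when hashCode() is negative
return Math.floorMod(address.hashCode(), numHosts) + 1;
}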
Comment on lines +117 to +123
Contributor

I don't understand the concept of host ID... please explain.

I'd imagine each host would be identified by its specific hostname and the configured bitmask it should use keyed by that name
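
One way to read the hostname-keyed approach above is sketched below, under stated assumptions: `HostBitmaskConfigSketch`, `parse`, and `bitmaskForHost` are hypothetical names, the sketch uses `java.util.Optional` rather than the Guava `Optional` used in the PR so that it stays self-contained, and it only handles plain hostnames (an IPv6 literal would collide with the `:` separator).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of this PR): looks up this host's bitmask directly by hostname. */
final class HostBitmaskConfigSketch {

  /** Parses "host1:0001,host2:0010" into hostname -> bitmask; rejects malformed entries. */
  static Map<String, Integer> parse(String hostToBitmaskMap) {
    Map<String, Integer> result = new HashMap<>();
    for (String entry : hostToBitmaskMap.split(",")) {
      String[] hostAndMask = entry.trim().split(":");
      if (hostAndMask.length != 2 || !hostAndMask[1].matches("[01]+")) {
        throw new IllegalArgumentException("Expected `host:bitmask`, got: " + entry);
      }
      result.put(hostAndMask[0], Integer.parseInt(hostAndMask[1], 2));
    }
    return result;
  }

  /** Returns the configured bitmask for the given host, or empty if the host is not named in the config. */
  static Optional<Integer> bitmaskForHost(String hostToBitmaskMap, String hostName) {
    return Optional.ofNullable(parse(hostToBitmaskMap).get(hostName));
  }

  public static void main(String[] args) {
    String config = "host0:0001,host1:0010";
    System.out.println(bitmaskForHost(config, "host1")); // Optional[2]
    System.out.println(bitmaskForHost(config, "host9")); // Optional.empty
  }
}
```

A host that is absent from the mapping can then fall back to a mask of `0` via `orElse(0)`, which matches nothing under a `(scattered & mask) != 0` check, so no failure is injected on that host.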


/**
* Returns bit mask for given host
* @param hostId
* @return
*/
protected int getBitMaskForHostId(int hostId) {
return this.hostIdToBitMask.get(hostId);
}

/**
* Return bit mask for the current host
*/
protected int getBitMaskForHost() throws UnknownHostException {
return getBitMaskForHostId(getHostIdFromAddress(InetAddress.getLocalHost()));
}

/**
* Apply a deterministic function to the input status to evaluate whether this host should fail to complete a lease
* for testing purposes.
*/
@Override
public boolean recordLeaseSuccess(LeaseObtainedStatus status) throws IOException {
// Get host bit mask
int bitMask = getBitMaskForHost();
if (shouldFailLeaseCompletionAttempt(status, bitMask, bitMaskLength)) {
log.info("Multi-active lease arbiter lease attempt: [{}, leaseAcquisitionTimestamp: {}] - FAILED to complete in "
+ "testing scenario", status, status.getLeaseAcquisitionTimestamp());
return false;
} else {
return super.recordLeaseSuccess(status);
}
}

/**
* Applies the bitmask to the lease acquisition timestamp of the provided status to evaluate whether the lease
* completion attempt on this host should fail
* @param status {@link org.apache.gobblin.runtime.api.MultiActiveLeaseArbiter.LeaseObtainedStatus}
* @param bitmask binary integer of `bitMaskLength` bits compared against the modified lease acquisition timestamp
* @param bitMaskLength number of trailing bits of the scattered timestamp to compare against the bitmask
* @return true if the host should fail the lease completion attempt
*/
protected static boolean shouldFailLeaseCompletionAttempt(LeaseObtainedStatus status, int bitmask,
Contributor

likely should be @VisibleForTesting

int bitMaskLength) {
// Convert event timestamp to binary
Long.toString(status.getLeaseAcquisitionTimestamp()).getBytes();
Contributor

remove this (since not assigned)

String binaryString = Long.toBinaryString(status.getLeaseAcquisitionTimestamp());
// Scatter binary bits
String scatteredBinaryString = scatterBinaryStringBits(binaryString);
// Take last `bitMaskLength` bits of the string
int length = scatteredBinaryString.length();
String shortenedBinaryString = scatteredBinaryString.substring(length-bitMaskLength, length);
Contributor @phet, Sep 12, 2023

let's do binary entirely with numbers (no strings). e.g.:

boolean shouldFail = (scatterTimestamp(ts) & bitmaskBits) != 0
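
A numbers-only version of the check could be sketched as follows. This is an illustration, not the PR's code: `NumericBitmaskCheckSketch` and `interleaveHalves` are hypothetical names, and the sketch interleaves the fixed low and high 32-bit halves of the `long`, whereas the PR splits the unpadded binary string in half by length.

```java
/** Hypothetical helper (not part of this PR): bit interleaving and bitmask check without strings. */
final class NumericBitmaskCheckSketch {

  /** Places the low 32 bits on even positions and the high 32 bits on odd positions of the result. */
  static long interleaveHalves(long value) {
    long high = value >>> 32;
    long low = value & 0xFFFFFFFFL;
    long result = 0L;
    for (int i = 0; i < 32; i++) {
      result |= ((low >>> i) & 1L) << (2 * i);       // bit i of the low half -> position 2i
      result |= ((high >>> i) & 1L) << (2 * i + 1);  // bit i of the high half -> position 2i + 1
    }
    return result;
  }

  /** Fails when any bit selected by the host's bitmask is set in the interleaved timestamp. */
  static boolean shouldFail(long timestampMillis, int bitmask) {
    return (interleaveHalves(timestampMillis) & bitmask) != 0;
  }

  public static void main(String[] args) {
    long ts = 1694500000000L;
    for (int mask : new int[] {0b0001, 0b0010, 0b0100, 0b1000}) {
      System.out.printf("mask %4s -> shouldFail=%b%n", Integer.toBinaryString(mask), shouldFail(ts, mask));
    }
  }
}
```

Two observations follow from this form. First, `!= 0` and the PR's `== bitmask` agree for single-bit masks, but differ for a mask of `0`: `!= 0` never fails, while `== bitmask` always fails. Second, with a 32/32 split the odd positions come from the high half, which for timestamps near `1694500000000L` is the constant 394 (binary `110001010`), so mask `0b0010` never fails and mask `0b1000` always fails until the high half next changes.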

// Apply bitmask
int binaryInt = Integer.valueOf(shortenedBinaryString, 2);
return (binaryInt & bitmask) == bitmask;
}

/**
* Given an input string in binary format, scatter the bits to arrange data in a non-contiguous, deterministic way.
* @param inputBinaryString 64-bit binary string
* @return a binary format String
*/
protected static String scatterBinaryStringBits(String inputBinaryString) {
String firstHalf = inputBinaryString.substring(0, inputBinaryString.length()/2);
String secondHalf = inputBinaryString.substring(inputBinaryString.length()/2);
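// Long.toBinaryString keeps the result in binary; String.valueOf would produce a decimal string
return Long.toBinaryString(interleaveBits(firstHalf, secondHalf));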
}

/**
* Interleave bits of two binary strings of the same length to return a binary format long with interleaved bits
* @param binaryString1 32-bit binary string
* @param binaryString2 32-bit binary string
* @return 64-bit binary format long
*/
protected static long interleaveBits(String binaryString1, String binaryString2) {
long binaryLong1 = Long.parseLong(binaryString1, 2);
long binaryLong2 = Long.parseLong(binaryString2, 2);
long result = 0;
for (int i=0; i < binaryString1.length(); i++) {
// Use a long literal so the shift is performed in 64 bits rather than 32
result |= ((binaryLong1 & (1L << i)) << i) | ((binaryLong2 & (1L << i)) << (i + 1));
}
return result;
}
}
@@ -16,6 +16,7 @@
*/
package org.apache.gobblin.util;

import com.google.common.base.Optional;
import java.net.InetAddress;
import java.net.UnknownHostException;

@@ -32,4 +33,16 @@ public static String getHostName() {
public static String getPrincipalUsingHostname(String name, String realm) {
return name + "/" + getHostName() + "@" + realm;
}

/**
* Given a host name return an optional containing the InetAddress object if one can be constructed for the input
* // TODO: provide details about expected hostName format
Contributor

may be no need to stipulate, given the type

... perhaps rename the function to getInetAddressForHostName?

*/
public static Optional<InetAddress> getAddressForHostName(String hostName) {
try {
return Optional.of(InetAddress.getByName(hostName));
} catch (UnknownHostException e) {
return Optional.absent();
}
}
}